{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hierarchical time series notebook"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook contains examples of modelling hierarchical time series."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Table of contents**\n",
    "* [Hierarchical time series](#chapter1)\n",
    "* [Preparing dataset](#chapter2)\n",
    "    * [Manually setting hierarchical structure](#chapter2_1)\n",
    "        * [Convert dataset to ETNA wide format](#chapter2_1_1)\n",
    "        * [Create HierarchicalStructure](#chapter2_1_2)\n",
    "        * [Create hierarchical dataset](#chapter2_1_3)\n",
    "    * [Hierarchical structure detection](#chapter2_2)\n",
    "        * [Prepare data in ETNA hierarchical long format](#chapter2_2_1)\n",
    "        * [Convert data to ETNA wide format with `to_hierarchical_dataset`](#chapter2_2_2)\n",
    "        * [Create the hierarchical dataset](#chapter2_2_3)\n",
    "* [Reconciliation methods](#chapter3)\n",
    "    * [Bottom-up approach](#chapter3_1)\n",
    "    * [Top-down approach](#chapter3_2)\n",
    "* [Exogenous variables for hierarchical forecasts](#chapter4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "\n",
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Hierarchical time series <a class=\"anchor\" id=\"chapter1\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In many applications, time series have a natural level structure. Time series with such a structure can be disaggregated by attributes\n",
    "from lower levels. Conversely, these time series can be aggregated to higher levels to represent more general relations.\n",
    "The set of possible levels forms the hierarchy of time series."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Hierarchy example](assets/hierarchical_pipeline/hierarchy.png)\n",
    "*Two level hierarchical structure*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The image above shows the relations between members of the hierarchy. Middle and top levels can be disaggregated using members from\n",
    "lower levels. For example:\n",
    "\n",
    "$$\n",
    "y_{A,t} = y_{AA,t} + y_{AB,t}\n",
    "$$\n",
    "\n",
    "$$\n",
    "y_{t} = y_{A,t} + y_{B,t}\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In matrix notation, level aggregation can be written as\n",
    "\n",
    "\\begin{equation*}\n",
    "    \\begin{bmatrix}\n",
    "        y_{A,t} \\\\\n",
    "        y_t\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    1 & 1 & 0 \\\\\n",
    "    1 & 1 & 1\n",
    "    \\end{bmatrix}\n",
    "    \\begin{bmatrix}\n",
    "    y_{AA,t} \\\\ y_{AB,t} \\\\ y_{B,t}\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    S\n",
    "    \\begin{bmatrix}\n",
    "    y_{AA,t} \\\\ y_{AB,t} \\\\ y_{B,t}\n",
    "    \\end{bmatrix}\n",
    "\\end{equation*}\n",
    "where $S$ is the summing matrix."
   ]
  },
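  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick numeric check of the matrix form, take the illustrative values $y_{AA,t} = 10$, $y_{AB,t} = 20$ and $y_{B,t} = 30$ (made up for this example, not taken from any dataset). Multiplying by $S$ gives\n",
    "\n",
    "\\begin{equation*}\n",
    "    \\begin{bmatrix}\n",
    "        y_{A,t} \\\\\n",
    "        y_t\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    1 & 1 & 0 \\\\\n",
    "    1 & 1 & 1\n",
    "    \\end{bmatrix}\n",
    "    \\begin{bmatrix}\n",
    "    10 \\\\ 20 \\\\ 30\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    30 \\\\ 60\n",
    "    \\end{bmatrix}\n",
    "\\end{equation*}\n",
    "so $y_{A,t} = 10 + 20 = 30$ and $y_t = 10 + 20 + 30 = 60$, in agreement with $y_t = y_{A,t} + y_{B,t} = 30 + 30$."
   ]
  },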
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Preparing dataset <a class=\"anchor\" id=\"chapter2\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider the Australian tourism dataset.\n",
    "\n",
    "This dataset consists of the following components:\n",
    "\n",
    "* `Total` - total domestic tourism demand,\n",
    "* Tourism reason components (`Hol` for holiday, `Bus` for business, etc.),\n",
    "* Components representing the \"region - reason\" division (`NSW - hol`, `NSW - bus`, etc.),\n",
    "* Components representing the \"region - reason - city\" division (`NSW - hol - city`, `NSW - hol - noncity`, etc.).\n",
    "\n",
    "We can see that these components form a hierarchy with the following levels:\n",
    "\n",
    "1. Total\n",
    "2. Tourism reason\n",
    "3. Region\n",
    "4. City"
   ]
  },
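  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each series in this hierarchy is the sum of its children on the level below. Assuming the column names shown in the data preview further down, two of the resulting aggregation constraints are\n",
    "\n",
    "$$\n",
    "y_{\\text{Total},t} = y_{\\text{Hol},t} + y_{\\text{VFR},t} + y_{\\text{Bus},t} + y_{\\text{Oth},t}\n",
    "$$\n",
    "\n",
    "$$\n",
    "y_{\\text{NSW - hol},t} = y_{\\text{NSW - hol - city},t} + y_{\\text{NSW - hol - noncity},t}\n",
    "$$\n",
    "\n",
    "The `2006-01-01` row of that preview is consistent with both: $84503 = 45906 + 26042 + 9815 + 2740$ and $17589 = 3096 + 14493$."
   ]
  },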
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "pd.options.display.max_columns = 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
   ],
   "source": [
    "!curl \"https://robjhyndman.com/data/hier1_with_names.csv\" --ssl-no-revoke -o \"hier1_with_names.csv\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Total</th>\n",
       "      <th>Hol</th>\n",
       "      <th>VFR</th>\n",
       "      <th>Bus</th>\n",
       "      <th>Oth</th>\n",
       "      <th>NSW - hol</th>\n",
       "      <th>VIC - hol</th>\n",
       "      <th>QLD - hol</th>\n",
       "      <th>SA - hol</th>\n",
       "      <th>WA - hol</th>\n",
       "      <th>TAS - hol</th>\n",
       "      <th>NT - hol</th>\n",
       "      <th>NSW - vfr</th>\n",
       "      <th>VIC - vfr</th>\n",
       "      <th>QLD - vfr</th>\n",
       "      <th>SA - vfr</th>\n",
       "      <th>WA - vfr</th>\n",
       "      <th>TAS - vfr</th>\n",
       "      <th>NT - vfr</th>\n",
       "      <th>NSW - bus</th>\n",
       "      <th>VIC - bus</th>\n",
       "      <th>QLD - bus</th>\n",
       "      <th>SA - bus</th>\n",
       "      <th>WA - bus</th>\n",
       "      <th>TAS - bus</th>\n",
       "      <th>NT - bus</th>\n",
       "      <th>NSW - oth</th>\n",
       "      <th>VIC - oth</th>\n",
       "      <th>QLD - oth</th>\n",
       "      <th>SA - oth</th>\n",
       "      <th>WA - oth</th>\n",
       "      <th>TAS - oth</th>\n",
       "      <th>NT - oth</th>\n",
       "      <th>NSW - hol - city</th>\n",
       "      <th>NSW - hol - noncity</th>\n",
       "      <th>VIC - hol - city</th>\n",
       "      <th>VIC - hol - noncity</th>\n",
       "      <th>QLD - hol - city</th>\n",
       "      <th>QLD - hol - noncity</th>\n",
       "      <th>SA - hol - city</th>\n",
       "      <th>SA - hol - noncity</th>\n",
       "      <th>WA - hol - city</th>\n",
       "      <th>WA - hol - noncity</th>\n",
       "      <th>TAS - hol - city</th>\n",
       "      <th>TAS - hol - noncity</th>\n",
       "      <th>NT - hol - city</th>\n",
       "      <th>NT - hol - noncity</th>\n",
       "      <th>NSW - vfr - city</th>\n",
       "      <th>NSW - vfr - noncity</th>\n",
       "      <th>VIC - vfr - city</th>\n",
       "      <th>VIC - vfr - noncity</th>\n",
       "      <th>QLD - vfr - city</th>\n",
       "      <th>QLD - vfr - noncity</th>\n",
       "      <th>SA - vfr - city</th>\n",
       "      <th>SA - vfr - noncity</th>\n",
       "      <th>WA - vfr - city</th>\n",
       "      <th>WA - vfr - noncity</th>\n",
       "      <th>TAS - vfr - city</th>\n",
       "      <th>TAS - vfr - noncity</th>\n",
       "      <th>NT - vfr - city</th>\n",
       "      <th>NT - vfr - noncity</th>\n",
       "      <th>NSW - bus - city</th>\n",
       "      <th>NSW - bus - noncity</th>\n",
       "      <th>VIC - bus - city</th>\n",
       "      <th>VIC - bus - noncity</th>\n",
       "      <th>QLD - bus - city</th>\n",
       "      <th>QLD - bus - noncity</th>\n",
       "      <th>SA - bus - city</th>\n",
       "      <th>SA - bus - noncity</th>\n",
       "      <th>WA - bus - city</th>\n",
       "      <th>WA - bus - noncity</th>\n",
       "      <th>TAS - bus - city</th>\n",
       "      <th>TAS - bus - noncity</th>\n",
       "      <th>NT - bus - city</th>\n",
       "      <th>NT - bus - noncity</th>\n",
       "      <th>NSW - oth - city</th>\n",
       "      <th>NSW - oth - noncity</th>\n",
       "      <th>VIC - oth - city</th>\n",
       "      <th>VIC - oth - noncity</th>\n",
       "      <th>QLD - oth - city</th>\n",
       "      <th>QLD - oth - noncity</th>\n",
       "      <th>SA - oth - city</th>\n",
       "      <th>SA - oth - noncity</th>\n",
       "      <th>WA - oth - city</th>\n",
       "      <th>WA - oth - noncity</th>\n",
       "      <th>TAS - oth - city</th>\n",
       "      <th>TAS - oth - noncity</th>\n",
       "      <th>NT - oth - city</th>\n",
       "      <th>NT - oth - noncity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>84503</td>\n",
       "      <td>45906</td>\n",
       "      <td>26042</td>\n",
       "      <td>9815</td>\n",
       "      <td>2740</td>\n",
       "      <td>17589</td>\n",
       "      <td>10412</td>\n",
       "      <td>9078</td>\n",
       "      <td>3089</td>\n",
       "      <td>3449</td>\n",
       "      <td>2102</td>\n",
       "      <td>187</td>\n",
       "      <td>9398</td>\n",
       "      <td>5993</td>\n",
       "      <td>5290</td>\n",
       "      <td>2193</td>\n",
       "      <td>1781</td>\n",
       "      <td>1350</td>\n",
       "      <td>37</td>\n",
       "      <td>2885</td>\n",
       "      <td>2148</td>\n",
       "      <td>2093</td>\n",
       "      <td>844</td>\n",
       "      <td>1406</td>\n",
       "      <td>223</td>\n",
       "      <td>216</td>\n",
       "      <td>906</td>\n",
       "      <td>467</td>\n",
       "      <td>702</td>\n",
       "      <td>317</td>\n",
       "      <td>205</td>\n",
       "      <td>100</td>\n",
       "      <td>43</td>\n",
       "      <td>3096</td>\n",
       "      <td>14493</td>\n",
       "      <td>2531</td>\n",
       "      <td>7881</td>\n",
       "      <td>4688</td>\n",
       "      <td>4390</td>\n",
       "      <td>888</td>\n",
       "      <td>2201</td>\n",
       "      <td>1383</td>\n",
       "      <td>2066</td>\n",
       "      <td>619</td>\n",
       "      <td>1483</td>\n",
       "      <td>101</td>\n",
       "      <td>86</td>\n",
       "      <td>2709</td>\n",
       "      <td>6689</td>\n",
       "      <td>2565</td>\n",
       "      <td>3428</td>\n",
       "      <td>3003</td>\n",
       "      <td>2287</td>\n",
       "      <td>1324</td>\n",
       "      <td>869</td>\n",
       "      <td>1019</td>\n",
       "      <td>762</td>\n",
       "      <td>602</td>\n",
       "      <td>748</td>\n",
       "      <td>28</td>\n",
       "      <td>9</td>\n",
       "      <td>1201</td>\n",
       "      <td>1684</td>\n",
       "      <td>1164</td>\n",
       "      <td>984</td>\n",
       "      <td>1111</td>\n",
       "      <td>982</td>\n",
       "      <td>388</td>\n",
       "      <td>456</td>\n",
       "      <td>532</td>\n",
       "      <td>874</td>\n",
       "      <td>116</td>\n",
       "      <td>107</td>\n",
       "      <td>136</td>\n",
       "      <td>80</td>\n",
       "      <td>396</td>\n",
       "      <td>510</td>\n",
       "      <td>181</td>\n",
       "      <td>286</td>\n",
       "      <td>431</td>\n",
       "      <td>271</td>\n",
       "      <td>244</td>\n",
       "      <td>73</td>\n",
       "      <td>168</td>\n",
       "      <td>37</td>\n",
       "      <td>76</td>\n",
       "      <td>24</td>\n",
       "      <td>35</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>65312</td>\n",
       "      <td>29347</td>\n",
       "      <td>20676</td>\n",
       "      <td>11823</td>\n",
       "      <td>3466</td>\n",
       "      <td>11027</td>\n",
       "      <td>6025</td>\n",
       "      <td>6310</td>\n",
       "      <td>1935</td>\n",
       "      <td>2454</td>\n",
       "      <td>1098</td>\n",
       "      <td>498</td>\n",
       "      <td>7829</td>\n",
       "      <td>4107</td>\n",
       "      <td>4902</td>\n",
       "      <td>1445</td>\n",
       "      <td>1353</td>\n",
       "      <td>523</td>\n",
       "      <td>517</td>\n",
       "      <td>4301</td>\n",
       "      <td>1825</td>\n",
       "      <td>2224</td>\n",
       "      <td>749</td>\n",
       "      <td>2043</td>\n",
       "      <td>373</td>\n",
       "      <td>308</td>\n",
       "      <td>1238</td>\n",
       "      <td>552</td>\n",
       "      <td>839</td>\n",
       "      <td>363</td>\n",
       "      <td>269</td>\n",
       "      <td>97</td>\n",
       "      <td>108</td>\n",
       "      <td>1479</td>\n",
       "      <td>9548</td>\n",
       "      <td>1439</td>\n",
       "      <td>4586</td>\n",
       "      <td>2320</td>\n",
       "      <td>3990</td>\n",
       "      <td>521</td>\n",
       "      <td>1414</td>\n",
       "      <td>1059</td>\n",
       "      <td>1395</td>\n",
       "      <td>409</td>\n",
       "      <td>689</td>\n",
       "      <td>201</td>\n",
       "      <td>297</td>\n",
       "      <td>2184</td>\n",
       "      <td>5645</td>\n",
       "      <td>1852</td>\n",
       "      <td>2255</td>\n",
       "      <td>1957</td>\n",
       "      <td>2945</td>\n",
       "      <td>806</td>\n",
       "      <td>639</td>\n",
       "      <td>750</td>\n",
       "      <td>603</td>\n",
       "      <td>257</td>\n",
       "      <td>266</td>\n",
       "      <td>168</td>\n",
       "      <td>349</td>\n",
       "      <td>2020</td>\n",
       "      <td>2281</td>\n",
       "      <td>1014</td>\n",
       "      <td>811</td>\n",
       "      <td>776</td>\n",
       "      <td>1448</td>\n",
       "      <td>346</td>\n",
       "      <td>403</td>\n",
       "      <td>356</td>\n",
       "      <td>1687</td>\n",
       "      <td>83</td>\n",
       "      <td>290</td>\n",
       "      <td>138</td>\n",
       "      <td>170</td>\n",
       "      <td>657</td>\n",
       "      <td>581</td>\n",
       "      <td>229</td>\n",
       "      <td>323</td>\n",
       "      <td>669</td>\n",
       "      <td>170</td>\n",
       "      <td>142</td>\n",
       "      <td>221</td>\n",
       "      <td>170</td>\n",
       "      <td>99</td>\n",
       "      <td>36</td>\n",
       "      <td>61</td>\n",
       "      <td>69</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>72753</td>\n",
       "      <td>32492</td>\n",
       "      <td>20582</td>\n",
       "      <td>13565</td>\n",
       "      <td>6114</td>\n",
       "      <td>8910</td>\n",
       "      <td>5060</td>\n",
       "      <td>11733</td>\n",
       "      <td>1569</td>\n",
       "      <td>3398</td>\n",
       "      <td>458</td>\n",
       "      <td>1364</td>\n",
       "      <td>7277</td>\n",
       "      <td>3811</td>\n",
       "      <td>5489</td>\n",
       "      <td>1453</td>\n",
       "      <td>1687</td>\n",
       "      <td>391</td>\n",
       "      <td>474</td>\n",
       "      <td>4093</td>\n",
       "      <td>1944</td>\n",
       "      <td>3379</td>\n",
       "      <td>750</td>\n",
       "      <td>1560</td>\n",
       "      <td>303</td>\n",
       "      <td>1536</td>\n",
       "      <td>1433</td>\n",
       "      <td>446</td>\n",
       "      <td>1434</td>\n",
       "      <td>712</td>\n",
       "      <td>1546</td>\n",
       "      <td>55</td>\n",
       "      <td>488</td>\n",
       "      <td>1609</td>\n",
       "      <td>7301</td>\n",
       "      <td>1488</td>\n",
       "      <td>3572</td>\n",
       "      <td>4758</td>\n",
       "      <td>6975</td>\n",
       "      <td>476</td>\n",
       "      <td>1093</td>\n",
       "      <td>1101</td>\n",
       "      <td>2297</td>\n",
       "      <td>127</td>\n",
       "      <td>331</td>\n",
       "      <td>619</td>\n",
       "      <td>745</td>\n",
       "      <td>2225</td>\n",
       "      <td>5052</td>\n",
       "      <td>1882</td>\n",
       "      <td>1929</td>\n",
       "      <td>2619</td>\n",
       "      <td>2870</td>\n",
       "      <td>1078</td>\n",
       "      <td>375</td>\n",
       "      <td>953</td>\n",
       "      <td>734</td>\n",
       "      <td>130</td>\n",
       "      <td>261</td>\n",
       "      <td>390</td>\n",
       "      <td>84</td>\n",
       "      <td>1975</td>\n",
       "      <td>2118</td>\n",
       "      <td>1153</td>\n",
       "      <td>791</td>\n",
       "      <td>1079</td>\n",
       "      <td>2300</td>\n",
       "      <td>390</td>\n",
       "      <td>360</td>\n",
       "      <td>440</td>\n",
       "      <td>1120</td>\n",
       "      <td>196</td>\n",
       "      <td>107</td>\n",
       "      <td>452</td>\n",
       "      <td>1084</td>\n",
       "      <td>540</td>\n",
       "      <td>893</td>\n",
       "      <td>128</td>\n",
       "      <td>318</td>\n",
       "      <td>270</td>\n",
       "      <td>1164</td>\n",
       "      <td>397</td>\n",
       "      <td>315</td>\n",
       "      <td>380</td>\n",
       "      <td>1166</td>\n",
       "      <td>32</td>\n",
       "      <td>23</td>\n",
       "      <td>150</td>\n",
       "      <td>338</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>70880</td>\n",
       "      <td>31813</td>\n",
       "      <td>21613</td>\n",
       "      <td>11478</td>\n",
       "      <td>5976</td>\n",
       "      <td>10658</td>\n",
       "      <td>5481</td>\n",
       "      <td>8109</td>\n",
       "      <td>2270</td>\n",
       "      <td>3561</td>\n",
       "      <td>1320</td>\n",
       "      <td>414</td>\n",
       "      <td>8303</td>\n",
       "      <td>5090</td>\n",
       "      <td>4441</td>\n",
       "      <td>1209</td>\n",
       "      <td>1714</td>\n",
       "      <td>394</td>\n",
       "      <td>462</td>\n",
       "      <td>3463</td>\n",
       "      <td>1753</td>\n",
       "      <td>2880</td>\n",
       "      <td>890</td>\n",
       "      <td>1791</td>\n",
       "      <td>298</td>\n",
       "      <td>403</td>\n",
       "      <td>1902</td>\n",
       "      <td>606</td>\n",
       "      <td>749</td>\n",
       "      <td>454</td>\n",
       "      <td>1549</td>\n",
       "      <td>91</td>\n",
       "      <td>625</td>\n",
       "      <td>1520</td>\n",
       "      <td>9138</td>\n",
       "      <td>1906</td>\n",
       "      <td>3575</td>\n",
       "      <td>3328</td>\n",
       "      <td>4781</td>\n",
       "      <td>571</td>\n",
       "      <td>1699</td>\n",
       "      <td>1128</td>\n",
       "      <td>2433</td>\n",
       "      <td>371</td>\n",
       "      <td>949</td>\n",
       "      <td>164</td>\n",
       "      <td>250</td>\n",
       "      <td>2918</td>\n",
       "      <td>5385</td>\n",
       "      <td>2208</td>\n",
       "      <td>2882</td>\n",
       "      <td>2097</td>\n",
       "      <td>2344</td>\n",
       "      <td>568</td>\n",
       "      <td>641</td>\n",
       "      <td>999</td>\n",
       "      <td>715</td>\n",
       "      <td>137</td>\n",
       "      <td>257</td>\n",
       "      <td>244</td>\n",
       "      <td>218</td>\n",
       "      <td>1500</td>\n",
       "      <td>1963</td>\n",
       "      <td>1245</td>\n",
       "      <td>508</td>\n",
       "      <td>1128</td>\n",
       "      <td>1752</td>\n",
       "      <td>255</td>\n",
       "      <td>635</td>\n",
       "      <td>539</td>\n",
       "      <td>1252</td>\n",
       "      <td>70</td>\n",
       "      <td>228</td>\n",
       "      <td>243</td>\n",
       "      <td>160</td>\n",
       "      <td>745</td>\n",
       "      <td>1157</td>\n",
       "      <td>270</td>\n",
       "      <td>336</td>\n",
       "      <td>214</td>\n",
       "      <td>535</td>\n",
       "      <td>194</td>\n",
       "      <td>260</td>\n",
       "      <td>410</td>\n",
       "      <td>1139</td>\n",
       "      <td>48</td>\n",
       "      <td>43</td>\n",
       "      <td>172</td>\n",
       "      <td>453</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>86893</td>\n",
       "      <td>46793</td>\n",
       "      <td>26947</td>\n",
       "      <td>10027</td>\n",
       "      <td>3126</td>\n",
       "      <td>16152</td>\n",
       "      <td>10958</td>\n",
       "      <td>10047</td>\n",
       "      <td>3023</td>\n",
       "      <td>4287</td>\n",
       "      <td>2113</td>\n",
       "      <td>213</td>\n",
       "      <td>10386</td>\n",
       "      <td>6152</td>\n",
       "      <td>5636</td>\n",
       "      <td>1685</td>\n",
       "      <td>2026</td>\n",
       "      <td>784</td>\n",
       "      <td>278</td>\n",
       "      <td>3347</td>\n",
       "      <td>1522</td>\n",
       "      <td>2751</td>\n",
       "      <td>666</td>\n",
       "      <td>1023</td>\n",
       "      <td>335</td>\n",
       "      <td>383</td>\n",
       "      <td>984</td>\n",
       "      <td>558</td>\n",
       "      <td>1015</td>\n",
       "      <td>180</td>\n",
       "      <td>190</td>\n",
       "      <td>137</td>\n",
       "      <td>62</td>\n",
       "      <td>1958</td>\n",
       "      <td>14194</td>\n",
       "      <td>2517</td>\n",
       "      <td>8441</td>\n",
       "      <td>4930</td>\n",
       "      <td>5117</td>\n",
       "      <td>873</td>\n",
       "      <td>2150</td>\n",
       "      <td>1560</td>\n",
       "      <td>2727</td>\n",
       "      <td>523</td>\n",
       "      <td>1590</td>\n",
       "      <td>62</td>\n",
       "      <td>151</td>\n",
       "      <td>3154</td>\n",
       "      <td>7232</td>\n",
       "      <td>2988</td>\n",
       "      <td>3164</td>\n",
       "      <td>2703</td>\n",
       "      <td>2933</td>\n",
       "      <td>887</td>\n",
       "      <td>798</td>\n",
       "      <td>1396</td>\n",
       "      <td>630</td>\n",
       "      <td>347</td>\n",
       "      <td>437</td>\n",
       "      <td>153</td>\n",
       "      <td>125</td>\n",
       "      <td>1196</td>\n",
       "      <td>2151</td>\n",
       "      <td>950</td>\n",
       "      <td>572</td>\n",
       "      <td>1192</td>\n",
       "      <td>1559</td>\n",
       "      <td>386</td>\n",
       "      <td>280</td>\n",
       "      <td>582</td>\n",
       "      <td>441</td>\n",
       "      <td>130</td>\n",
       "      <td>205</td>\n",
       "      <td>194</td>\n",
       "      <td>189</td>\n",
       "      <td>426</td>\n",
       "      <td>558</td>\n",
       "      <td>265</td>\n",
       "      <td>293</td>\n",
       "      <td>458</td>\n",
       "      <td>557</td>\n",
       "      <td>147</td>\n",
       "      <td>33</td>\n",
       "      <td>162</td>\n",
       "      <td>28</td>\n",
       "      <td>77</td>\n",
       "      <td>60</td>\n",
       "      <td>15</td>\n",
       "      <td>47</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            Total    Hol    VFR    Bus   Oth  NSW - hol  VIC - hol  QLD - hol  \\\n",
       "timestamp                                                                       \n",
       "2006-01-01  84503  45906  26042   9815  2740      17589      10412       9078   \n",
       "2006-02-01  65312  29347  20676  11823  3466      11027       6025       6310   \n",
       "2006-03-01  72753  32492  20582  13565  6114       8910       5060      11733   \n",
       "2006-04-01  70880  31813  21613  11478  5976      10658       5481       8109   \n",
       "2006-05-01  86893  46793  26947  10027  3126      16152      10958      10047   \n",
       "\n",
       "            SA - hol  WA - hol  TAS - hol  NT - hol  NSW - vfr  VIC - vfr  \\\n",
       "timestamp                                                                   \n",
       "2006-01-01      3089      3449       2102       187       9398       5993   \n",
       "2006-02-01      1935      2454       1098       498       7829       4107   \n",
       "2006-03-01      1569      3398        458      1364       7277       3811   \n",
       "2006-04-01      2270      3561       1320       414       8303       5090   \n",
       "2006-05-01      3023      4287       2113       213      10386       6152   \n",
       "\n",
       "            QLD - vfr  SA - vfr  WA - vfr  TAS - vfr  NT - vfr  NSW - bus  \\\n",
       "timestamp                                                                   \n",
       "2006-01-01       5290      2193      1781       1350        37       2885   \n",
       "2006-02-01       4902      1445      1353        523       517       4301   \n",
       "2006-03-01       5489      1453      1687        391       474       4093   \n",
       "2006-04-01       4441      1209      1714        394       462       3463   \n",
       "2006-05-01       5636      1685      2026        784       278       3347   \n",
       "\n",
       "            VIC - bus  QLD - bus  SA - bus  WA - bus  TAS - bus  NT - bus  \\\n",
       "timestamp                                                                   \n",
       "2006-01-01       2148       2093       844      1406        223       216   \n",
       "2006-02-01       1825       2224       749      2043        373       308   \n",
       "2006-03-01       1944       3379       750      1560        303      1536   \n",
       "2006-04-01       1753       2880       890      1791        298       403   \n",
       "2006-05-01       1522       2751       666      1023        335       383   \n",
       "\n",
       "            NSW - oth  VIC - oth  QLD - oth  SA - oth  WA - oth  TAS - oth  \\\n",
       "timestamp                                                                    \n",
       "2006-01-01        906        467        702       317       205        100   \n",
       "2006-02-01       1238        552        839       363       269         97   \n",
       "2006-03-01       1433        446       1434       712      1546         55   \n",
       "2006-04-01       1902        606        749       454      1549         91   \n",
       "2006-05-01        984        558       1015       180       190        137   \n",
       "\n",
       "            NT - oth  NSW - hol - city  NSW - hol - noncity  VIC - hol - city  \\\n",
       "timestamp                                                                       \n",
       "2006-01-01        43              3096                14493              2531   \n",
       "2006-02-01       108              1479                 9548              1439   \n",
       "2006-03-01       488              1609                 7301              1488   \n",
       "2006-04-01       625              1520                 9138              1906   \n",
       "2006-05-01        62              1958                14194              2517   \n",
       "\n",
       "            VIC - hol - noncity  QLD - hol - city  QLD - hol - noncity  \\\n",
       "timestamp                                                                \n",
       "2006-01-01                 7881              4688                 4390   \n",
       "2006-02-01                 4586              2320                 3990   \n",
       "2006-03-01                 3572              4758                 6975   \n",
       "2006-04-01                 3575              3328                 4781   \n",
       "2006-05-01                 8441              4930                 5117   \n",
       "\n",
       "            SA - hol - city  SA - hol - noncity  WA - hol - city  \\\n",
       "timestamp                                                          \n",
       "2006-01-01              888                2201             1383   \n",
       "2006-02-01              521                1414             1059   \n",
       "2006-03-01              476                1093             1101   \n",
       "2006-04-01              571                1699             1128   \n",
       "2006-05-01              873                2150             1560   \n",
       "\n",
       "            WA - hol - noncity  TAS - hol - city  TAS - hol - noncity  \\\n",
       "timestamp                                                               \n",
       "2006-01-01                2066               619                 1483   \n",
       "2006-02-01                1395               409                  689   \n",
       "2006-03-01                2297               127                  331   \n",
       "2006-04-01                2433               371                  949   \n",
       "2006-05-01                2727               523                 1590   \n",
       "\n",
       "            NT - hol - city  NT - hol - noncity  NSW - vfr - city  \\\n",
       "timestamp                                                           \n",
       "2006-01-01              101                  86              2709   \n",
       "2006-02-01              201                 297              2184   \n",
       "2006-03-01              619                 745              2225   \n",
       "2006-04-01              164                 250              2918   \n",
       "2006-05-01               62                 151              3154   \n",
       "\n",
       "            NSW - vfr - noncity  VIC - vfr - city  VIC - vfr - noncity  \\\n",
       "timestamp                                                                \n",
       "2006-01-01                 6689              2565                 3428   \n",
       "2006-02-01                 5645              1852                 2255   \n",
       "2006-03-01                 5052              1882                 1929   \n",
       "2006-04-01                 5385              2208                 2882   \n",
       "2006-05-01                 7232              2988                 3164   \n",
       "\n",
       "            QLD - vfr - city  QLD - vfr - noncity  SA - vfr - city  \\\n",
       "timestamp                                                            \n",
       "2006-01-01              3003                 2287             1324   \n",
       "2006-02-01              1957                 2945              806   \n",
       "2006-03-01              2619                 2870             1078   \n",
       "2006-04-01              2097                 2344              568   \n",
       "2006-05-01              2703                 2933              887   \n",
       "\n",
       "            SA - vfr - noncity  WA - vfr - city  WA - vfr - noncity  \\\n",
       "timestamp                                                             \n",
       "2006-01-01                 869             1019                 762   \n",
       "2006-02-01                 639              750                 603   \n",
       "2006-03-01                 375              953                 734   \n",
       "2006-04-01                 641              999                 715   \n",
       "2006-05-01                 798             1396                 630   \n",
       "\n",
       "            TAS - vfr - city  TAS - vfr - noncity  NT - vfr - city  \\\n",
       "timestamp                                                            \n",
       "2006-01-01               602                  748               28   \n",
       "2006-02-01               257                  266              168   \n",
       "2006-03-01               130                  261              390   \n",
       "2006-04-01               137                  257              244   \n",
       "2006-05-01               347                  437              153   \n",
       "\n",
       "            NT - vfr - noncity  NSW - bus - city  NSW - bus - noncity  \\\n",
       "timestamp                                                               \n",
       "2006-01-01                   9              1201                 1684   \n",
       "2006-02-01                 349              2020                 2281   \n",
       "2006-03-01                  84              1975                 2118   \n",
       "2006-04-01                 218              1500                 1963   \n",
       "2006-05-01                 125              1196                 2151   \n",
       "\n",
       "            VIC - bus - city  VIC - bus - noncity  QLD - bus - city  \\\n",
       "timestamp                                                             \n",
       "2006-01-01              1164                  984              1111   \n",
       "2006-02-01              1014                  811               776   \n",
       "2006-03-01              1153                  791              1079   \n",
       "2006-04-01              1245                  508              1128   \n",
       "2006-05-01               950                  572              1192   \n",
       "\n",
       "            QLD - bus - noncity  SA - bus - city  SA - bus - noncity  \\\n",
       "timestamp                                                              \n",
       "2006-01-01                  982              388                 456   \n",
       "2006-02-01                 1448              346                 403   \n",
       "2006-03-01                 2300              390                 360   \n",
       "2006-04-01                 1752              255                 635   \n",
       "2006-05-01                 1559              386                 280   \n",
       "\n",
       "            WA - bus - city  WA - bus - noncity  TAS - bus - city  \\\n",
       "timestamp                                                           \n",
       "2006-01-01              532                 874               116   \n",
       "2006-02-01              356                1687                83   \n",
       "2006-03-01              440                1120               196   \n",
       "2006-04-01              539                1252                70   \n",
       "2006-05-01              582                 441               130   \n",
       "\n",
       "            TAS - bus - noncity  NT - bus - city  NT - bus - noncity  \\\n",
       "timestamp                                                              \n",
       "2006-01-01                  107              136                  80   \n",
       "2006-02-01                  290              138                 170   \n",
       "2006-03-01                  107              452                1084   \n",
       "2006-04-01                  228              243                 160   \n",
       "2006-05-01                  205              194                 189   \n",
       "\n",
       "            NSW - oth - city  NSW - oth - noncity  VIC - oth - city  \\\n",
       "timestamp                                                             \n",
       "2006-01-01               396                  510               181   \n",
       "2006-02-01               657                  581               229   \n",
       "2006-03-01               540                  893               128   \n",
       "2006-04-01               745                 1157               270   \n",
       "2006-05-01               426                  558               265   \n",
       "\n",
       "            VIC - oth - noncity  QLD - oth - city  QLD - oth - noncity  \\\n",
       "timestamp                                                                \n",
       "2006-01-01                  286               431                  271   \n",
       "2006-02-01                  323               669                  170   \n",
       "2006-03-01                  318               270                 1164   \n",
       "2006-04-01                  336               214                  535   \n",
       "2006-05-01                  293               458                  557   \n",
       "\n",
       "            SA - oth - city  SA - oth - noncity  WA - oth - city  \\\n",
       "timestamp                                                          \n",
       "2006-01-01              244                  73              168   \n",
       "2006-02-01              142                 221              170   \n",
       "2006-03-01              397                 315              380   \n",
       "2006-04-01              194                 260              410   \n",
       "2006-05-01              147                  33              162   \n",
       "\n",
       "            WA - oth - noncity  TAS - oth - city  TAS - oth - noncity  \\\n",
       "timestamp                                                               \n",
       "2006-01-01                  37                76                   24   \n",
       "2006-02-01                  99                36                   61   \n",
       "2006-03-01                1166                32                   23   \n",
       "2006-04-01                1139                48                   43   \n",
       "2006-05-01                  28                77                   60   \n",
       "\n",
       "            NT - oth - city  NT - oth - noncity  \n",
       "timestamp                                        \n",
       "2006-01-01               35                   8  \n",
       "2006-02-01               69                  39  \n",
       "2006-03-01              150                 338  \n",
       "2006-04-01              172                 453  \n",
       "2006-05-01               15                  47  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(\"hier1_with_names.csv\")\n",
    "\n",
    "periods = len(df)\n",
    "df[\"timestamp\"] = pd.date_range(\"2006-01-01\", periods=periods, freq=\"MS\")\n",
    "df.set_index(\"timestamp\", inplace=True)\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1 Manually setting hierarchical structure <a class=\"anchor\" id=\"chapter2_1\"></a>\n",
    "This section presents how to set hierarchical structure and prepare data. We are going to create a hierarchical\n",
    "dataset with two levels: total demand and demand per tourism reason."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from etna.datasets import TSDataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consider the **Reason** level of the hierarchy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Hol</th>\n",
       "      <th>VFR</th>\n",
       "      <th>Bus</th>\n",
       "      <th>Oth</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>45906</td>\n",
       "      <td>26042</td>\n",
       "      <td>9815</td>\n",
       "      <td>2740</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>29347</td>\n",
       "      <td>20676</td>\n",
       "      <td>11823</td>\n",
       "      <td>3466</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>32492</td>\n",
       "      <td>20582</td>\n",
       "      <td>13565</td>\n",
       "      <td>6114</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>31813</td>\n",
       "      <td>21613</td>\n",
       "      <td>11478</td>\n",
       "      <td>5976</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>46793</td>\n",
       "      <td>26947</td>\n",
       "      <td>10027</td>\n",
       "      <td>3126</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              Hol    VFR    Bus   Oth\n",
       "timestamp                            \n",
       "2006-01-01  45906  26042   9815  2740\n",
       "2006-02-01  29347  20676  11823  3466\n",
       "2006-03-01  32492  20582  13565  6114\n",
       "2006-04-01  31813  21613  11478  5976\n",
       "2006-05-01  46793  26947  10027  3126"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reason_segments = [\"Hol\", \"VFR\", \"Bus\", \"Oth\"]\n",
    "\n",
    "df[reason_segments].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1.1 Convert dataset to ETNA wide format <a class=\"anchor\" id=\"chapter2_1_1\"></a>\n",
    "First, convert dataframe to ETNA long format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>timestamp</th>\n",
       "      <th>target</th>\n",
       "      <th>segment</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2006-01-01</td>\n",
       "      <td>45906</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2006-02-01</td>\n",
       "      <td>29347</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2006-03-01</td>\n",
       "      <td>32492</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2006-04-01</td>\n",
       "      <td>31813</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2006-05-01</td>\n",
       "      <td>46793</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   timestamp  target segment\n",
       "0 2006-01-01   45906     Hol\n",
       "1 2006-02-01   29347     Hol\n",
       "2 2006-03-01   32492     Hol\n",
       "3 2006-04-01   31813     Hol\n",
       "4 2006-05-01   46793     Hol"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_df = []\n",
    "for segment_name in reason_segments:\n",
    "    segment = df[segment_name]\n",
    "\n",
    "    segment_slice = pd.DataFrame(\n",
    "        {\"timestamp\": segment.index, \"target\": segment.values, \"segment\": [segment_name] * periods}\n",
    "    )\n",
    "    hierarchical_df.append(segment_slice)\n",
    "\n",
    "hierarchical_df = pd.concat(hierarchical_df, axis=0)\n",
    "\n",
    "hierarchical_df.head()"
   ]
  },
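  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same long layout can likely be produced without an explicit loop. The following is a minimal sketch on a toy dataframe with made-up values, assuming the wide dataframe is indexed by `timestamp` and has one column per segment; `wide` and `long_df` are illustrative names that are not used elsewhere in this notebook.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy wide dataframe (made-up values): one column per segment, indexed by timestamp.\n",
    "wide = pd.DataFrame(\n",
    "    {\"Hol\": [10, 20], \"VFR\": [30, 40]},\n",
    "    index=pd.date_range(\"2006-01-01\", periods=2, freq=\"MS\", name=\"timestamp\"),\n",
    ")\n",
    "\n",
    "# melt stacks the segment columns into (timestamp, segment, target) rows.\n",
    "long_df = wide.reset_index().melt(id_vars=\"timestamp\", var_name=\"segment\", value_name=\"target\")\n",
    "```\n",
    "\n",
    "The column order differs from the loop above (`segment` comes before `target`), which should not matter as long as the columns are addressed by name."
   ]
  },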
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, the dataframe could be converted to ETNA wide format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "hierarchical_df = TSDataset.to_dataset(df=hierarchical_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1.2 Creat HierarchicalStructure <a class=\"anchor\" id=\"chapter2_1_2\"></a>\n",
    "For handling information about hierarchical structure, there is a dedicated object in the ETNA library: `HierarchicalStructure`.\n",
    "\n",
    "To create `HierarchicalStructure` define relationships between segments at different levels. This relation should be\n",
    "described as mapping between levels members, where keys are parent segments and values are lists of child segments\n",
    "from the lower level. Also provide a list of level names, where ordering corresponds to hierarchical relationships\n",
    "between levels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from etna.datasets import HierarchicalStructure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "HierarchicalStructure(level_structure = {'total': ['Hol', 'VFR', 'Bus', 'Oth']}, level_names = ['total', 'reason'], )"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_structure = HierarchicalStructure(\n",
    "    level_structure={\"total\": [\"Hol\", \"VFR\", \"Bus\", \"Oth\"]}, level_names=[\"total\", \"reason\"]\n",
    ")\n",
    "\n",
    "hierarchical_structure"
   ]
  },
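  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the meaning of the mapping concrete, here is a small plain-Python sketch that does not use ETNA. It reads the same `level_structure` dictionary as a parent-to-children mapping and sums the children to obtain the parent value. The numbers are the `2006-01-01` row of the reason-level table shown earlier; `reason_values` and `parent_values` are illustrative names introduced only for this example.\n",
    "\n",
    "```python\n",
    "# Adjacency mapping in the same shape as `level_structure`: parent -> list of children.\n",
    "level_structure = {\"total\": [\"Hol\", \"VFR\", \"Bus\", \"Oth\"]}\n",
    "\n",
    "# Values of the reason-level segments for 2006-01-01, copied from the table above.\n",
    "reason_values = {\"Hol\": 45906, \"VFR\": 26042, \"Bus\": 9815, \"Oth\": 2740}\n",
    "\n",
    "# Under the hierarchy, a parent's value is the sum of its children's values.\n",
    "parent_values = {\n",
    "    parent: sum(reason_values[child] for child in children)\n",
    "    for parent, children in level_structure.items()\n",
    "}\n",
    "```\n",
    "\n",
    "Read this way, each key of `level_structure` is a series that can be obtained by aggregating the series listed under it."
   ]
  },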
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1.3 Create hierarchical dataset <a class=\"anchor\" id=\"chapter2_1_3\"></a>\n",
    "When all the data is prepared, call the `TSDataset` constructor to create a hierarchical dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus</th>\n",
       "      <th>Hol</th>\n",
       "      <th>Oth</th>\n",
       "      <th>VFR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>9815</td>\n",
       "      <td>45906</td>\n",
       "      <td>2740</td>\n",
       "      <td>26042</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>11823</td>\n",
       "      <td>29347</td>\n",
       "      <td>3466</td>\n",
       "      <td>20676</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>13565</td>\n",
       "      <td>32492</td>\n",
       "      <td>6114</td>\n",
       "      <td>20582</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>11478</td>\n",
       "      <td>31813</td>\n",
       "      <td>5976</td>\n",
       "      <td>21613</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>10027</td>\n",
       "      <td>46793</td>\n",
       "      <td>3126</td>\n",
       "      <td>26947</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment       Bus    Hol    Oth    VFR\n",
       "feature    target target target target\n",
       "timestamp                             \n",
       "2006-01-01   9815  45906   2740  26042\n",
       "2006-02-01  11823  29347   3466  20676\n",
       "2006-03-01  13565  32492   6114  20582\n",
       "2006-04-01  11478  31813   5976  21613\n",
       "2006-05-01  10027  46793   3126  26947"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts = TSDataset(df=hierarchical_df, freq=\"MS\", hierarchical_structure=hierarchical_structure)\n",
    "\n",
    "hierarchical_ts.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ensure that the dataset is at the desired level."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'reason'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts.current_df_level"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2 Hierarchical structure detection <a class=\"anchor\" id=\"chapter2_2\"></a>\n",
    "\n",
    "This section presents how to prepare data and detect hierarchical structure.\n",
    "The main advantage of this approach for creating hierarchical structures is that you don't need to define an adjacency list.\n",
    "All hierarchical relationships would be detected from the dataframe columns.\n",
    "\n",
    "The main applications for this approach are when defining the adjacency list is not desirable or when some columns of\n",
    "the dataframe already have information about hierarchy (e.g. related categorical columns).\n",
    "\n",
    "A data frame must be prepared in a specific format for detection to work. The following sections show how to do so.\n",
    "\n",
    "Consider the City level of the hierarchy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NSW - hol - city</th>\n",
       "      <th>NSW - hol - noncity</th>\n",
       "      <th>VIC - hol - city</th>\n",
       "      <th>VIC - hol - noncity</th>\n",
       "      <th>QLD - hol - city</th>\n",
       "      <th>QLD - hol - noncity</th>\n",
       "      <th>SA - hol - city</th>\n",
       "      <th>SA - hol - noncity</th>\n",
       "      <th>WA - hol - city</th>\n",
       "      <th>WA - hol - noncity</th>\n",
       "      <th>TAS - hol - city</th>\n",
       "      <th>TAS - hol - noncity</th>\n",
       "      <th>NT - hol - city</th>\n",
       "      <th>NT - hol - noncity</th>\n",
       "      <th>NSW - vfr - city</th>\n",
       "      <th>NSW - vfr - noncity</th>\n",
       "      <th>VIC - vfr - city</th>\n",
       "      <th>VIC - vfr - noncity</th>\n",
       "      <th>QLD - vfr - city</th>\n",
       "      <th>QLD - vfr - noncity</th>\n",
       "      <th>SA - vfr - city</th>\n",
       "      <th>SA - vfr - noncity</th>\n",
       "      <th>WA - vfr - city</th>\n",
       "      <th>WA - vfr - noncity</th>\n",
       "      <th>TAS - vfr - city</th>\n",
       "      <th>TAS - vfr - noncity</th>\n",
       "      <th>NT - vfr - city</th>\n",
       "      <th>NT - vfr - noncity</th>\n",
       "      <th>NSW - bus - city</th>\n",
       "      <th>NSW - bus - noncity</th>\n",
       "      <th>VIC - bus - city</th>\n",
       "      <th>VIC - bus - noncity</th>\n",
       "      <th>QLD - bus - city</th>\n",
       "      <th>QLD - bus - noncity</th>\n",
       "      <th>SA - bus - city</th>\n",
       "      <th>SA - bus - noncity</th>\n",
       "      <th>WA - bus - city</th>\n",
       "      <th>WA - bus - noncity</th>\n",
       "      <th>TAS - bus - city</th>\n",
       "      <th>TAS - bus - noncity</th>\n",
       "      <th>NT - bus - city</th>\n",
       "      <th>NT - bus - noncity</th>\n",
       "      <th>NSW - oth - city</th>\n",
       "      <th>NSW - oth - noncity</th>\n",
       "      <th>VIC - oth - city</th>\n",
       "      <th>VIC - oth - noncity</th>\n",
       "      <th>QLD - oth - city</th>\n",
       "      <th>QLD - oth - noncity</th>\n",
       "      <th>SA - oth - city</th>\n",
       "      <th>SA - oth - noncity</th>\n",
       "      <th>WA - oth - city</th>\n",
       "      <th>WA - oth - noncity</th>\n",
       "      <th>TAS - oth - city</th>\n",
       "      <th>TAS - oth - noncity</th>\n",
       "      <th>NT - oth - city</th>\n",
       "      <th>NT - oth - noncity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>3096</td>\n",
       "      <td>14493</td>\n",
       "      <td>2531</td>\n",
       "      <td>7881</td>\n",
       "      <td>4688</td>\n",
       "      <td>4390</td>\n",
       "      <td>888</td>\n",
       "      <td>2201</td>\n",
       "      <td>1383</td>\n",
       "      <td>2066</td>\n",
       "      <td>619</td>\n",
       "      <td>1483</td>\n",
       "      <td>101</td>\n",
       "      <td>86</td>\n",
       "      <td>2709</td>\n",
       "      <td>6689</td>\n",
       "      <td>2565</td>\n",
       "      <td>3428</td>\n",
       "      <td>3003</td>\n",
       "      <td>2287</td>\n",
       "      <td>1324</td>\n",
       "      <td>869</td>\n",
       "      <td>1019</td>\n",
       "      <td>762</td>\n",
       "      <td>602</td>\n",
       "      <td>748</td>\n",
       "      <td>28</td>\n",
       "      <td>9</td>\n",
       "      <td>1201</td>\n",
       "      <td>1684</td>\n",
       "      <td>1164</td>\n",
       "      <td>984</td>\n",
       "      <td>1111</td>\n",
       "      <td>982</td>\n",
       "      <td>388</td>\n",
       "      <td>456</td>\n",
       "      <td>532</td>\n",
       "      <td>874</td>\n",
       "      <td>116</td>\n",
       "      <td>107</td>\n",
       "      <td>136</td>\n",
       "      <td>80</td>\n",
       "      <td>396</td>\n",
       "      <td>510</td>\n",
       "      <td>181</td>\n",
       "      <td>286</td>\n",
       "      <td>431</td>\n",
       "      <td>271</td>\n",
       "      <td>244</td>\n",
       "      <td>73</td>\n",
       "      <td>168</td>\n",
       "      <td>37</td>\n",
       "      <td>76</td>\n",
       "      <td>24</td>\n",
       "      <td>35</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>1479</td>\n",
       "      <td>9548</td>\n",
       "      <td>1439</td>\n",
       "      <td>4586</td>\n",
       "      <td>2320</td>\n",
       "      <td>3990</td>\n",
       "      <td>521</td>\n",
       "      <td>1414</td>\n",
       "      <td>1059</td>\n",
       "      <td>1395</td>\n",
       "      <td>409</td>\n",
       "      <td>689</td>\n",
       "      <td>201</td>\n",
       "      <td>297</td>\n",
       "      <td>2184</td>\n",
       "      <td>5645</td>\n",
       "      <td>1852</td>\n",
       "      <td>2255</td>\n",
       "      <td>1957</td>\n",
       "      <td>2945</td>\n",
       "      <td>806</td>\n",
       "      <td>639</td>\n",
       "      <td>750</td>\n",
       "      <td>603</td>\n",
       "      <td>257</td>\n",
       "      <td>266</td>\n",
       "      <td>168</td>\n",
       "      <td>349</td>\n",
       "      <td>2020</td>\n",
       "      <td>2281</td>\n",
       "      <td>1014</td>\n",
       "      <td>811</td>\n",
       "      <td>776</td>\n",
       "      <td>1448</td>\n",
       "      <td>346</td>\n",
       "      <td>403</td>\n",
       "      <td>356</td>\n",
       "      <td>1687</td>\n",
       "      <td>83</td>\n",
       "      <td>290</td>\n",
       "      <td>138</td>\n",
       "      <td>170</td>\n",
       "      <td>657</td>\n",
       "      <td>581</td>\n",
       "      <td>229</td>\n",
       "      <td>323</td>\n",
       "      <td>669</td>\n",
       "      <td>170</td>\n",
       "      <td>142</td>\n",
       "      <td>221</td>\n",
       "      <td>170</td>\n",
       "      <td>99</td>\n",
       "      <td>36</td>\n",
       "      <td>61</td>\n",
       "      <td>69</td>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>1609</td>\n",
       "      <td>7301</td>\n",
       "      <td>1488</td>\n",
       "      <td>3572</td>\n",
       "      <td>4758</td>\n",
       "      <td>6975</td>\n",
       "      <td>476</td>\n",
       "      <td>1093</td>\n",
       "      <td>1101</td>\n",
       "      <td>2297</td>\n",
       "      <td>127</td>\n",
       "      <td>331</td>\n",
       "      <td>619</td>\n",
       "      <td>745</td>\n",
       "      <td>2225</td>\n",
       "      <td>5052</td>\n",
       "      <td>1882</td>\n",
       "      <td>1929</td>\n",
       "      <td>2619</td>\n",
       "      <td>2870</td>\n",
       "      <td>1078</td>\n",
       "      <td>375</td>\n",
       "      <td>953</td>\n",
       "      <td>734</td>\n",
       "      <td>130</td>\n",
       "      <td>261</td>\n",
       "      <td>390</td>\n",
       "      <td>84</td>\n",
       "      <td>1975</td>\n",
       "      <td>2118</td>\n",
       "      <td>1153</td>\n",
       "      <td>791</td>\n",
       "      <td>1079</td>\n",
       "      <td>2300</td>\n",
       "      <td>390</td>\n",
       "      <td>360</td>\n",
       "      <td>440</td>\n",
       "      <td>1120</td>\n",
       "      <td>196</td>\n",
       "      <td>107</td>\n",
       "      <td>452</td>\n",
       "      <td>1084</td>\n",
       "      <td>540</td>\n",
       "      <td>893</td>\n",
       "      <td>128</td>\n",
       "      <td>318</td>\n",
       "      <td>270</td>\n",
       "      <td>1164</td>\n",
       "      <td>397</td>\n",
       "      <td>315</td>\n",
       "      <td>380</td>\n",
       "      <td>1166</td>\n",
       "      <td>32</td>\n",
       "      <td>23</td>\n",
       "      <td>150</td>\n",
       "      <td>338</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>1520</td>\n",
       "      <td>9138</td>\n",
       "      <td>1906</td>\n",
       "      <td>3575</td>\n",
       "      <td>3328</td>\n",
       "      <td>4781</td>\n",
       "      <td>571</td>\n",
       "      <td>1699</td>\n",
       "      <td>1128</td>\n",
       "      <td>2433</td>\n",
       "      <td>371</td>\n",
       "      <td>949</td>\n",
       "      <td>164</td>\n",
       "      <td>250</td>\n",
       "      <td>2918</td>\n",
       "      <td>5385</td>\n",
       "      <td>2208</td>\n",
       "      <td>2882</td>\n",
       "      <td>2097</td>\n",
       "      <td>2344</td>\n",
       "      <td>568</td>\n",
       "      <td>641</td>\n",
       "      <td>999</td>\n",
       "      <td>715</td>\n",
       "      <td>137</td>\n",
       "      <td>257</td>\n",
       "      <td>244</td>\n",
       "      <td>218</td>\n",
       "      <td>1500</td>\n",
       "      <td>1963</td>\n",
       "      <td>1245</td>\n",
       "      <td>508</td>\n",
       "      <td>1128</td>\n",
       "      <td>1752</td>\n",
       "      <td>255</td>\n",
       "      <td>635</td>\n",
       "      <td>539</td>\n",
       "      <td>1252</td>\n",
       "      <td>70</td>\n",
       "      <td>228</td>\n",
       "      <td>243</td>\n",
       "      <td>160</td>\n",
       "      <td>745</td>\n",
       "      <td>1157</td>\n",
       "      <td>270</td>\n",
       "      <td>336</td>\n",
       "      <td>214</td>\n",
       "      <td>535</td>\n",
       "      <td>194</td>\n",
       "      <td>260</td>\n",
       "      <td>410</td>\n",
       "      <td>1139</td>\n",
       "      <td>48</td>\n",
       "      <td>43</td>\n",
       "      <td>172</td>\n",
       "      <td>453</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>1958</td>\n",
       "      <td>14194</td>\n",
       "      <td>2517</td>\n",
       "      <td>8441</td>\n",
       "      <td>4930</td>\n",
       "      <td>5117</td>\n",
       "      <td>873</td>\n",
       "      <td>2150</td>\n",
       "      <td>1560</td>\n",
       "      <td>2727</td>\n",
       "      <td>523</td>\n",
       "      <td>1590</td>\n",
       "      <td>62</td>\n",
       "      <td>151</td>\n",
       "      <td>3154</td>\n",
       "      <td>7232</td>\n",
       "      <td>2988</td>\n",
       "      <td>3164</td>\n",
       "      <td>2703</td>\n",
       "      <td>2933</td>\n",
       "      <td>887</td>\n",
       "      <td>798</td>\n",
       "      <td>1396</td>\n",
       "      <td>630</td>\n",
       "      <td>347</td>\n",
       "      <td>437</td>\n",
       "      <td>153</td>\n",
       "      <td>125</td>\n",
       "      <td>1196</td>\n",
       "      <td>2151</td>\n",
       "      <td>950</td>\n",
       "      <td>572</td>\n",
       "      <td>1192</td>\n",
       "      <td>1559</td>\n",
       "      <td>386</td>\n",
       "      <td>280</td>\n",
       "      <td>582</td>\n",
       "      <td>441</td>\n",
       "      <td>130</td>\n",
       "      <td>205</td>\n",
       "      <td>194</td>\n",
       "      <td>189</td>\n",
       "      <td>426</td>\n",
       "      <td>558</td>\n",
       "      <td>265</td>\n",
       "      <td>293</td>\n",
       "      <td>458</td>\n",
       "      <td>557</td>\n",
       "      <td>147</td>\n",
       "      <td>33</td>\n",
       "      <td>162</td>\n",
       "      <td>28</td>\n",
       "      <td>77</td>\n",
       "      <td>60</td>\n",
       "      <td>15</td>\n",
       "      <td>47</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            NSW - hol - city  NSW - hol - noncity  VIC - hol - city  \\\n",
       "timestamp                                                             \n",
       "2006-01-01              3096                14493              2531   \n",
       "2006-02-01              1479                 9548              1439   \n",
       "2006-03-01              1609                 7301              1488   \n",
       "2006-04-01              1520                 9138              1906   \n",
       "2006-05-01              1958                14194              2517   \n",
       "\n",
       "            VIC - hol - noncity  QLD - hol - city  QLD - hol - noncity  \\\n",
       "timestamp                                                                \n",
       "2006-01-01                 7881              4688                 4390   \n",
       "2006-02-01                 4586              2320                 3990   \n",
       "2006-03-01                 3572              4758                 6975   \n",
       "2006-04-01                 3575              3328                 4781   \n",
       "2006-05-01                 8441              4930                 5117   \n",
       "\n",
       "            SA - hol - city  SA - hol - noncity  WA - hol - city  \\\n",
       "timestamp                                                          \n",
       "2006-01-01              888                2201             1383   \n",
       "2006-02-01              521                1414             1059   \n",
       "2006-03-01              476                1093             1101   \n",
       "2006-04-01              571                1699             1128   \n",
       "2006-05-01              873                2150             1560   \n",
       "\n",
       "            WA - hol - noncity  TAS - hol - city  TAS - hol - noncity  \\\n",
       "timestamp                                                               \n",
       "2006-01-01                2066               619                 1483   \n",
       "2006-02-01                1395               409                  689   \n",
       "2006-03-01                2297               127                  331   \n",
       "2006-04-01                2433               371                  949   \n",
       "2006-05-01                2727               523                 1590   \n",
       "\n",
       "            NT - hol - city  NT - hol - noncity  NSW - vfr - city  \\\n",
       "timestamp                                                           \n",
       "2006-01-01              101                  86              2709   \n",
       "2006-02-01              201                 297              2184   \n",
       "2006-03-01              619                 745              2225   \n",
       "2006-04-01              164                 250              2918   \n",
       "2006-05-01               62                 151              3154   \n",
       "\n",
       "            NSW - vfr - noncity  VIC - vfr - city  VIC - vfr - noncity  \\\n",
       "timestamp                                                                \n",
       "2006-01-01                 6689              2565                 3428   \n",
       "2006-02-01                 5645              1852                 2255   \n",
       "2006-03-01                 5052              1882                 1929   \n",
       "2006-04-01                 5385              2208                 2882   \n",
       "2006-05-01                 7232              2988                 3164   \n",
       "\n",
       "            QLD - vfr - city  QLD - vfr - noncity  SA - vfr - city  \\\n",
       "timestamp                                                            \n",
       "2006-01-01              3003                 2287             1324   \n",
       "2006-02-01              1957                 2945              806   \n",
       "2006-03-01              2619                 2870             1078   \n",
       "2006-04-01              2097                 2344              568   \n",
       "2006-05-01              2703                 2933              887   \n",
       "\n",
       "            SA - vfr - noncity  WA - vfr - city  WA - vfr - noncity  \\\n",
       "timestamp                                                             \n",
       "2006-01-01                 869             1019                 762   \n",
       "2006-02-01                 639              750                 603   \n",
       "2006-03-01                 375              953                 734   \n",
       "2006-04-01                 641              999                 715   \n",
       "2006-05-01                 798             1396                 630   \n",
       "\n",
       "            TAS - vfr - city  TAS - vfr - noncity  NT - vfr - city  \\\n",
       "timestamp                                                            \n",
       "2006-01-01               602                  748               28   \n",
       "2006-02-01               257                  266              168   \n",
       "2006-03-01               130                  261              390   \n",
       "2006-04-01               137                  257              244   \n",
       "2006-05-01               347                  437              153   \n",
       "\n",
       "            NT - vfr - noncity  NSW - bus - city  NSW - bus - noncity  \\\n",
       "timestamp                                                               \n",
       "2006-01-01                   9              1201                 1684   \n",
       "2006-02-01                 349              2020                 2281   \n",
       "2006-03-01                  84              1975                 2118   \n",
       "2006-04-01                 218              1500                 1963   \n",
       "2006-05-01                 125              1196                 2151   \n",
       "\n",
       "            VIC - bus - city  VIC - bus - noncity  QLD - bus - city  \\\n",
       "timestamp                                                             \n",
       "2006-01-01              1164                  984              1111   \n",
       "2006-02-01              1014                  811               776   \n",
       "2006-03-01              1153                  791              1079   \n",
       "2006-04-01              1245                  508              1128   \n",
       "2006-05-01               950                  572              1192   \n",
       "\n",
       "            QLD - bus - noncity  SA - bus - city  SA - bus - noncity  \\\n",
       "timestamp                                                              \n",
       "2006-01-01                  982              388                 456   \n",
       "2006-02-01                 1448              346                 403   \n",
       "2006-03-01                 2300              390                 360   \n",
       "2006-04-01                 1752              255                 635   \n",
       "2006-05-01                 1559              386                 280   \n",
       "\n",
       "            WA - bus - city  WA - bus - noncity  TAS - bus - city  \\\n",
       "timestamp                                                           \n",
       "2006-01-01              532                 874               116   \n",
       "2006-02-01              356                1687                83   \n",
       "2006-03-01              440                1120               196   \n",
       "2006-04-01              539                1252                70   \n",
       "2006-05-01              582                 441               130   \n",
       "\n",
       "            TAS - bus - noncity  NT - bus - city  NT - bus - noncity  \\\n",
       "timestamp                                                              \n",
       "2006-01-01                  107              136                  80   \n",
       "2006-02-01                  290              138                 170   \n",
       "2006-03-01                  107              452                1084   \n",
       "2006-04-01                  228              243                 160   \n",
       "2006-05-01                  205              194                 189   \n",
       "\n",
       "            NSW - oth - city  NSW - oth - noncity  VIC - oth - city  \\\n",
       "timestamp                                                             \n",
       "2006-01-01               396                  510               181   \n",
       "2006-02-01               657                  581               229   \n",
       "2006-03-01               540                  893               128   \n",
       "2006-04-01               745                 1157               270   \n",
       "2006-05-01               426                  558               265   \n",
       "\n",
       "            VIC - oth - noncity  QLD - oth - city  QLD - oth - noncity  \\\n",
       "timestamp                                                                \n",
       "2006-01-01                  286               431                  271   \n",
       "2006-02-01                  323               669                  170   \n",
       "2006-03-01                  318               270                 1164   \n",
       "2006-04-01                  336               214                  535   \n",
       "2006-05-01                  293               458                  557   \n",
       "\n",
       "            SA - oth - city  SA - oth - noncity  WA - oth - city  \\\n",
       "timestamp                                                          \n",
       "2006-01-01              244                  73              168   \n",
       "2006-02-01              142                 221              170   \n",
       "2006-03-01              397                 315              380   \n",
       "2006-04-01              194                 260              410   \n",
       "2006-05-01              147                  33              162   \n",
       "\n",
       "            WA - oth - noncity  TAS - oth - city  TAS - oth - noncity  \\\n",
       "timestamp                                                               \n",
       "2006-01-01                  37                76                   24   \n",
       "2006-02-01                  99                36                   61   \n",
       "2006-03-01                1166                32                   23   \n",
       "2006-04-01                1139                48                   43   \n",
       "2006-05-01                  28                77                   60   \n",
       "\n",
       "            NT - oth - city  NT - oth - noncity  \n",
       "timestamp                                        \n",
       "2006-01-01               35                   8  \n",
       "2006-02-01               69                  39  \n",
       "2006-03-01              150                 338  \n",
       "2006-04-01              172                 453  \n",
       "2006-05-01               15                  47  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "city_segments = list(filter(lambda name: name.count(\"-\") == 2, df.columns))\n",
    "\n",
    "df[city_segments].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2.1 Prepare data in ETNA hierarchical long format <a class=\"anchor\" id=\"chapter2_2_1\"></a>\n",
    "Before trying to detect a hierarchical structure, data must be transformed to hierarchical long format. In this format,\n",
    "your `DataFrame` must contain `timestamp`, `target` and level columns. Each level column represents membership of the\n",
    "observation at higher levels of the hierarchy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>timestamp</th>\n",
       "      <th>target</th>\n",
       "      <th>city_level</th>\n",
       "      <th>region_level</th>\n",
       "      <th>reason_level</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2006-01-01</td>\n",
       "      <td>3096</td>\n",
       "      <td>city</td>\n",
       "      <td>NSW</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2006-02-01</td>\n",
       "      <td>1479</td>\n",
       "      <td>city</td>\n",
       "      <td>NSW</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2006-03-01</td>\n",
       "      <td>1609</td>\n",
       "      <td>city</td>\n",
       "      <td>NSW</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2006-04-01</td>\n",
       "      <td>1520</td>\n",
       "      <td>city</td>\n",
       "      <td>NSW</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2006-05-01</td>\n",
       "      <td>1958</td>\n",
       "      <td>city</td>\n",
       "      <td>NSW</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   timestamp  target city_level region_level reason_level\n",
       "0 2006-01-01    3096       city          NSW          Hol\n",
       "1 2006-02-01    1479       city          NSW          Hol\n",
       "2 2006-03-01    1609       city          NSW          Hol\n",
       "3 2006-04-01    1520       city          NSW          Hol\n",
       "4 2006-05-01    1958       city          NSW          Hol"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_df = []\n",
    "for segment_name in city_segments:\n",
    "    segment = df[segment_name]\n",
    "    region, reason, city = segment_name.split(\" - \")\n",
    "\n",
    "    seg_df = pd.DataFrame(\n",
    "        data={\n",
    "            \"timestamp\": segment.index,\n",
    "            \"target\": segment.values,\n",
    "            \"city_level\": [city] * periods,\n",
    "            \"region_level\": [region] * periods,\n",
    "            \"reason_level\": [reason] * periods,\n",
    "        },\n",
    "    )\n",
    "    hierarchical_df.append(seg_df)\n",
    "\n",
    "hierarchical_df = pd.concat(hierarchical_df, axis=0)\n",
    "\n",
    "hierarchical_df[\"reason_level\"].replace({\"hol\": \"Hol\", \"vfr\": \"VFR\", \"bus\": \"Bus\", \"oth\": \"Oth\"}, inplace=True)\n",
    "hierarchical_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we can omit total level as it will be added automatically in hierarchy detection."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2.2 Convert data to etna wide format with `to_hierarchical_dataset`<a class=\"anchor\" id=\"chapter2_2_2\"></a>\n",
    "To detect hierarchical structure and convert `DataFrame` to ETNA wide format, call `TSDataset.to_hierarchical_dataset`,\n",
    "provided with prepared data and level columns names in order from top to bottom."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus_NSW_city</th>\n",
       "      <th>Bus_NSW_noncity</th>\n",
       "      <th>Bus_NT_city</th>\n",
       "      <th>Bus_NT_noncity</th>\n",
       "      <th>Bus_QLD_city</th>\n",
       "      <th>Bus_QLD_noncity</th>\n",
       "      <th>Bus_SA_city</th>\n",
       "      <th>Bus_SA_noncity</th>\n",
       "      <th>Bus_TAS_city</th>\n",
       "      <th>Bus_TAS_noncity</th>\n",
       "      <th>Bus_VIC_city</th>\n",
       "      <th>Bus_VIC_noncity</th>\n",
       "      <th>Bus_WA_city</th>\n",
       "      <th>Bus_WA_noncity</th>\n",
       "      <th>Hol_NSW_city</th>\n",
       "      <th>Hol_NSW_noncity</th>\n",
       "      <th>Hol_NT_city</th>\n",
       "      <th>Hol_NT_noncity</th>\n",
       "      <th>Hol_QLD_city</th>\n",
       "      <th>Hol_QLD_noncity</th>\n",
       "      <th>Hol_SA_city</th>\n",
       "      <th>Hol_SA_noncity</th>\n",
       "      <th>Hol_TAS_city</th>\n",
       "      <th>Hol_TAS_noncity</th>\n",
       "      <th>Hol_VIC_city</th>\n",
       "      <th>Hol_VIC_noncity</th>\n",
       "      <th>Hol_WA_city</th>\n",
       "      <th>Hol_WA_noncity</th>\n",
       "      <th>Oth_NSW_city</th>\n",
       "      <th>Oth_NSW_noncity</th>\n",
       "      <th>Oth_NT_city</th>\n",
       "      <th>Oth_NT_noncity</th>\n",
       "      <th>Oth_QLD_city</th>\n",
       "      <th>Oth_QLD_noncity</th>\n",
       "      <th>Oth_SA_city</th>\n",
       "      <th>Oth_SA_noncity</th>\n",
       "      <th>Oth_TAS_city</th>\n",
       "      <th>Oth_TAS_noncity</th>\n",
       "      <th>Oth_VIC_city</th>\n",
       "      <th>Oth_VIC_noncity</th>\n",
       "      <th>Oth_WA_city</th>\n",
       "      <th>Oth_WA_noncity</th>\n",
       "      <th>VFR_NSW_city</th>\n",
       "      <th>VFR_NSW_noncity</th>\n",
       "      <th>VFR_NT_city</th>\n",
       "      <th>VFR_NT_noncity</th>\n",
       "      <th>VFR_QLD_city</th>\n",
       "      <th>VFR_QLD_noncity</th>\n",
       "      <th>VFR_SA_city</th>\n",
       "      <th>VFR_SA_noncity</th>\n",
       "      <th>VFR_TAS_city</th>\n",
       "      <th>VFR_TAS_noncity</th>\n",
       "      <th>VFR_VIC_city</th>\n",
       "      <th>VFR_VIC_noncity</th>\n",
       "      <th>VFR_WA_city</th>\n",
       "      <th>VFR_WA_noncity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1201</td>\n",
       "      <td>1684</td>\n",
       "      <td>136</td>\n",
       "      <td>80</td>\n",
       "      <td>1111</td>\n",
       "      <td>982</td>\n",
       "      <td>388</td>\n",
       "      <td>456</td>\n",
       "      <td>116</td>\n",
       "      <td>107</td>\n",
       "      <td>1164</td>\n",
       "      <td>984</td>\n",
       "      <td>532</td>\n",
       "      <td>874</td>\n",
       "      <td>3096</td>\n",
       "      <td>14493</td>\n",
       "      <td>101</td>\n",
       "      <td>86</td>\n",
       "      <td>4688</td>\n",
       "      <td>4390</td>\n",
       "      <td>888</td>\n",
       "      <td>2201</td>\n",
       "      <td>619</td>\n",
       "      <td>1483</td>\n",
       "      <td>2531</td>\n",
       "      <td>7881</td>\n",
       "      <td>1383</td>\n",
       "      <td>2066</td>\n",
       "      <td>396</td>\n",
       "      <td>510</td>\n",
       "      <td>35</td>\n",
       "      <td>8</td>\n",
       "      <td>431</td>\n",
       "      <td>271</td>\n",
       "      <td>244</td>\n",
       "      <td>73</td>\n",
       "      <td>76</td>\n",
       "      <td>24</td>\n",
       "      <td>181</td>\n",
       "      <td>286</td>\n",
       "      <td>168</td>\n",
       "      <td>37</td>\n",
       "      <td>2709</td>\n",
       "      <td>6689</td>\n",
       "      <td>28</td>\n",
       "      <td>9</td>\n",
       "      <td>3003</td>\n",
       "      <td>2287</td>\n",
       "      <td>1324</td>\n",
       "      <td>869</td>\n",
       "      <td>602</td>\n",
       "      <td>748</td>\n",
       "      <td>2565</td>\n",
       "      <td>3428</td>\n",
       "      <td>1019</td>\n",
       "      <td>762</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2020</td>\n",
       "      <td>2281</td>\n",
       "      <td>138</td>\n",
       "      <td>170</td>\n",
       "      <td>776</td>\n",
       "      <td>1448</td>\n",
       "      <td>346</td>\n",
       "      <td>403</td>\n",
       "      <td>83</td>\n",
       "      <td>290</td>\n",
       "      <td>1014</td>\n",
       "      <td>811</td>\n",
       "      <td>356</td>\n",
       "      <td>1687</td>\n",
       "      <td>1479</td>\n",
       "      <td>9548</td>\n",
       "      <td>201</td>\n",
       "      <td>297</td>\n",
       "      <td>2320</td>\n",
       "      <td>3990</td>\n",
       "      <td>521</td>\n",
       "      <td>1414</td>\n",
       "      <td>409</td>\n",
       "      <td>689</td>\n",
       "      <td>1439</td>\n",
       "      <td>4586</td>\n",
       "      <td>1059</td>\n",
       "      <td>1395</td>\n",
       "      <td>657</td>\n",
       "      <td>581</td>\n",
       "      <td>69</td>\n",
       "      <td>39</td>\n",
       "      <td>669</td>\n",
       "      <td>170</td>\n",
       "      <td>142</td>\n",
       "      <td>221</td>\n",
       "      <td>36</td>\n",
       "      <td>61</td>\n",
       "      <td>229</td>\n",
       "      <td>323</td>\n",
       "      <td>170</td>\n",
       "      <td>99</td>\n",
       "      <td>2184</td>\n",
       "      <td>5645</td>\n",
       "      <td>168</td>\n",
       "      <td>349</td>\n",
       "      <td>1957</td>\n",
       "      <td>2945</td>\n",
       "      <td>806</td>\n",
       "      <td>639</td>\n",
       "      <td>257</td>\n",
       "      <td>266</td>\n",
       "      <td>1852</td>\n",
       "      <td>2255</td>\n",
       "      <td>750</td>\n",
       "      <td>603</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>1975</td>\n",
       "      <td>2118</td>\n",
       "      <td>452</td>\n",
       "      <td>1084</td>\n",
       "      <td>1079</td>\n",
       "      <td>2300</td>\n",
       "      <td>390</td>\n",
       "      <td>360</td>\n",
       "      <td>196</td>\n",
       "      <td>107</td>\n",
       "      <td>1153</td>\n",
       "      <td>791</td>\n",
       "      <td>440</td>\n",
       "      <td>1120</td>\n",
       "      <td>1609</td>\n",
       "      <td>7301</td>\n",
       "      <td>619</td>\n",
       "      <td>745</td>\n",
       "      <td>4758</td>\n",
       "      <td>6975</td>\n",
       "      <td>476</td>\n",
       "      <td>1093</td>\n",
       "      <td>127</td>\n",
       "      <td>331</td>\n",
       "      <td>1488</td>\n",
       "      <td>3572</td>\n",
       "      <td>1101</td>\n",
       "      <td>2297</td>\n",
       "      <td>540</td>\n",
       "      <td>893</td>\n",
       "      <td>150</td>\n",
       "      <td>338</td>\n",
       "      <td>270</td>\n",
       "      <td>1164</td>\n",
       "      <td>397</td>\n",
       "      <td>315</td>\n",
       "      <td>32</td>\n",
       "      <td>23</td>\n",
       "      <td>128</td>\n",
       "      <td>318</td>\n",
       "      <td>380</td>\n",
       "      <td>1166</td>\n",
       "      <td>2225</td>\n",
       "      <td>5052</td>\n",
       "      <td>390</td>\n",
       "      <td>84</td>\n",
       "      <td>2619</td>\n",
       "      <td>2870</td>\n",
       "      <td>1078</td>\n",
       "      <td>375</td>\n",
       "      <td>130</td>\n",
       "      <td>261</td>\n",
       "      <td>1882</td>\n",
       "      <td>1929</td>\n",
       "      <td>953</td>\n",
       "      <td>734</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>1500</td>\n",
       "      <td>1963</td>\n",
       "      <td>243</td>\n",
       "      <td>160</td>\n",
       "      <td>1128</td>\n",
       "      <td>1752</td>\n",
       "      <td>255</td>\n",
       "      <td>635</td>\n",
       "      <td>70</td>\n",
       "      <td>228</td>\n",
       "      <td>1245</td>\n",
       "      <td>508</td>\n",
       "      <td>539</td>\n",
       "      <td>1252</td>\n",
       "      <td>1520</td>\n",
       "      <td>9138</td>\n",
       "      <td>164</td>\n",
       "      <td>250</td>\n",
       "      <td>3328</td>\n",
       "      <td>4781</td>\n",
       "      <td>571</td>\n",
       "      <td>1699</td>\n",
       "      <td>371</td>\n",
       "      <td>949</td>\n",
       "      <td>1906</td>\n",
       "      <td>3575</td>\n",
       "      <td>1128</td>\n",
       "      <td>2433</td>\n",
       "      <td>745</td>\n",
       "      <td>1157</td>\n",
       "      <td>172</td>\n",
       "      <td>453</td>\n",
       "      <td>214</td>\n",
       "      <td>535</td>\n",
       "      <td>194</td>\n",
       "      <td>260</td>\n",
       "      <td>48</td>\n",
       "      <td>43</td>\n",
       "      <td>270</td>\n",
       "      <td>336</td>\n",
       "      <td>410</td>\n",
       "      <td>1139</td>\n",
       "      <td>2918</td>\n",
       "      <td>5385</td>\n",
       "      <td>244</td>\n",
       "      <td>218</td>\n",
       "      <td>2097</td>\n",
       "      <td>2344</td>\n",
       "      <td>568</td>\n",
       "      <td>641</td>\n",
       "      <td>137</td>\n",
       "      <td>257</td>\n",
       "      <td>2208</td>\n",
       "      <td>2882</td>\n",
       "      <td>999</td>\n",
       "      <td>715</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>1196</td>\n",
       "      <td>2151</td>\n",
       "      <td>194</td>\n",
       "      <td>189</td>\n",
       "      <td>1192</td>\n",
       "      <td>1559</td>\n",
       "      <td>386</td>\n",
       "      <td>280</td>\n",
       "      <td>130</td>\n",
       "      <td>205</td>\n",
       "      <td>950</td>\n",
       "      <td>572</td>\n",
       "      <td>582</td>\n",
       "      <td>441</td>\n",
       "      <td>1958</td>\n",
       "      <td>14194</td>\n",
       "      <td>62</td>\n",
       "      <td>151</td>\n",
       "      <td>4930</td>\n",
       "      <td>5117</td>\n",
       "      <td>873</td>\n",
       "      <td>2150</td>\n",
       "      <td>523</td>\n",
       "      <td>1590</td>\n",
       "      <td>2517</td>\n",
       "      <td>8441</td>\n",
       "      <td>1560</td>\n",
       "      <td>2727</td>\n",
       "      <td>426</td>\n",
       "      <td>558</td>\n",
       "      <td>15</td>\n",
       "      <td>47</td>\n",
       "      <td>458</td>\n",
       "      <td>557</td>\n",
       "      <td>147</td>\n",
       "      <td>33</td>\n",
       "      <td>77</td>\n",
       "      <td>60</td>\n",
       "      <td>265</td>\n",
       "      <td>293</td>\n",
       "      <td>162</td>\n",
       "      <td>28</td>\n",
       "      <td>3154</td>\n",
       "      <td>7232</td>\n",
       "      <td>153</td>\n",
       "      <td>125</td>\n",
       "      <td>2703</td>\n",
       "      <td>2933</td>\n",
       "      <td>887</td>\n",
       "      <td>798</td>\n",
       "      <td>347</td>\n",
       "      <td>437</td>\n",
       "      <td>2988</td>\n",
       "      <td>3164</td>\n",
       "      <td>1396</td>\n",
       "      <td>630</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment    Bus_NSW_city Bus_NSW_noncity Bus_NT_city Bus_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1201            1684         136             80   \n",
       "2006-02-01         2020            2281         138            170   \n",
       "2006-03-01         1975            2118         452           1084   \n",
       "2006-04-01         1500            1963         243            160   \n",
       "2006-05-01         1196            2151         194            189   \n",
       "\n",
       "segment    Bus_QLD_city Bus_QLD_noncity Bus_SA_city Bus_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1111             982         388            456   \n",
       "2006-02-01          776            1448         346            403   \n",
       "2006-03-01         1079            2300         390            360   \n",
       "2006-04-01         1128            1752         255            635   \n",
       "2006-05-01         1192            1559         386            280   \n",
       "\n",
       "segment    Bus_TAS_city Bus_TAS_noncity Bus_VIC_city Bus_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01          116             107         1164             984   \n",
       "2006-02-01           83             290         1014             811   \n",
       "2006-03-01          196             107         1153             791   \n",
       "2006-04-01           70             228         1245             508   \n",
       "2006-05-01          130             205          950             572   \n",
       "\n",
       "segment    Bus_WA_city Bus_WA_noncity Hol_NSW_city Hol_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         532            874         3096           14493   \n",
       "2006-02-01         356           1687         1479            9548   \n",
       "2006-03-01         440           1120         1609            7301   \n",
       "2006-04-01         539           1252         1520            9138   \n",
       "2006-05-01         582            441         1958           14194   \n",
       "\n",
       "segment    Hol_NT_city Hol_NT_noncity Hol_QLD_city Hol_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         101             86         4688            4390   \n",
       "2006-02-01         201            297         2320            3990   \n",
       "2006-03-01         619            745         4758            6975   \n",
       "2006-04-01         164            250         3328            4781   \n",
       "2006-05-01          62            151         4930            5117   \n",
       "\n",
       "segment    Hol_SA_city Hol_SA_noncity Hol_TAS_city Hol_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         888           2201          619            1483   \n",
       "2006-02-01         521           1414          409             689   \n",
       "2006-03-01         476           1093          127             331   \n",
       "2006-04-01         571           1699          371             949   \n",
       "2006-05-01         873           2150          523            1590   \n",
       "\n",
       "segment    Hol_VIC_city Hol_VIC_noncity Hol_WA_city Hol_WA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         2531            7881        1383           2066   \n",
       "2006-02-01         1439            4586        1059           1395   \n",
       "2006-03-01         1488            3572        1101           2297   \n",
       "2006-04-01         1906            3575        1128           2433   \n",
       "2006-05-01         2517            8441        1560           2727   \n",
       "\n",
       "segment    Oth_NSW_city Oth_NSW_noncity Oth_NT_city Oth_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          396             510          35              8   \n",
       "2006-02-01          657             581          69             39   \n",
       "2006-03-01          540             893         150            338   \n",
       "2006-04-01          745            1157         172            453   \n",
       "2006-05-01          426             558          15             47   \n",
       "\n",
       "segment    Oth_QLD_city Oth_QLD_noncity Oth_SA_city Oth_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          431             271         244             73   \n",
       "2006-02-01          669             170         142            221   \n",
       "2006-03-01          270            1164         397            315   \n",
       "2006-04-01          214             535         194            260   \n",
       "2006-05-01          458             557         147             33   \n",
       "\n",
       "segment    Oth_TAS_city Oth_TAS_noncity Oth_VIC_city Oth_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01           76              24          181             286   \n",
       "2006-02-01           36              61          229             323   \n",
       "2006-03-01           32              23          128             318   \n",
       "2006-04-01           48              43          270             336   \n",
       "2006-05-01           77              60          265             293   \n",
       "\n",
       "segment    Oth_WA_city Oth_WA_noncity VFR_NSW_city VFR_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         168             37         2709            6689   \n",
       "2006-02-01         170             99         2184            5645   \n",
       "2006-03-01         380           1166         2225            5052   \n",
       "2006-04-01         410           1139         2918            5385   \n",
       "2006-05-01         162             28         3154            7232   \n",
       "\n",
       "segment    VFR_NT_city VFR_NT_noncity VFR_QLD_city VFR_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01          28              9         3003            2287   \n",
       "2006-02-01         168            349         1957            2945   \n",
       "2006-03-01         390             84         2619            2870   \n",
       "2006-04-01         244            218         2097            2344   \n",
       "2006-05-01         153            125         2703            2933   \n",
       "\n",
       "segment    VFR_SA_city VFR_SA_noncity VFR_TAS_city VFR_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01        1324            869          602             748   \n",
       "2006-02-01         806            639          257             266   \n",
       "2006-03-01        1078            375          130             261   \n",
       "2006-04-01         568            641          137             257   \n",
       "2006-05-01         887            798          347             437   \n",
       "\n",
       "segment    VFR_VIC_city VFR_VIC_noncity VFR_WA_city VFR_WA_noncity  \n",
       "feature          target          target      target         target  \n",
       "timestamp                                                           \n",
       "2006-01-01         2565            3428        1019            762  \n",
       "2006-02-01         1852            2255         750            603  \n",
       "2006-03-01         1882            1929         953            734  \n",
       "2006-04-01         2208            2882         999            715  \n",
       "2006-05-01         2988            3164        1396            630  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_df, hierarchical_structure = TSDataset.to_hierarchical_dataset(\n",
    "    df=hierarchical_df, level_columns=[\"reason_level\", \"region_level\", \"city_level\"]\n",
    ")\n",
    "\n",
    "hierarchical_df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "HierarchicalStructure(level_structure = {'total': ['Hol', 'VFR', 'Bus', 'Oth'], 'Bus': ['Bus_NSW', 'Bus_VIC', 'Bus_QLD', 'Bus_SA', 'Bus_WA', 'Bus_TAS', 'Bus_NT'], 'Hol': ['Hol_NSW', 'Hol_VIC', 'Hol_QLD', 'Hol_SA', 'Hol_WA', 'Hol_TAS', 'Hol_NT'], 'Oth': ['Oth_NSW', 'Oth_VIC', 'Oth_QLD', 'Oth_SA', 'Oth_WA', 'Oth_TAS', 'Oth_NT'], 'VFR': ['VFR_NSW', 'VFR_VIC', 'VFR_QLD', 'VFR_SA', 'VFR_WA', 'VFR_TAS', 'VFR_NT'], 'Bus_NSW': ['Bus_NSW_city', 'Bus_NSW_noncity'], 'Bus_NT': ['Bus_NT_city', 'Bus_NT_noncity'], 'Bus_QLD': ['Bus_QLD_city', 'Bus_QLD_noncity'], 'Bus_SA': ['Bus_SA_city', 'Bus_SA_noncity'], 'Bus_TAS': ['Bus_TAS_city', 'Bus_TAS_noncity'], 'Bus_VIC': ['Bus_VIC_city', 'Bus_VIC_noncity'], 'Bus_WA': ['Bus_WA_city', 'Bus_WA_noncity'], 'Hol_NSW': ['Hol_NSW_city', 'Hol_NSW_noncity'], 'Hol_NT': ['Hol_NT_city', 'Hol_NT_noncity'], 'Hol_QLD': ['Hol_QLD_city', 'Hol_QLD_noncity'], 'Hol_SA': ['Hol_SA_city', 'Hol_SA_noncity'], 'Hol_TAS': ['Hol_TAS_city', 'Hol_TAS_noncity'], 'Hol_VIC': ['Hol_VIC_city', 'Hol_VIC_noncity'], 'Hol_WA': ['Hol_WA_city', 'Hol_WA_noncity'], 'Oth_NSW': ['Oth_NSW_city', 'Oth_NSW_noncity'], 'Oth_NT': ['Oth_NT_city', 'Oth_NT_noncity'], 'Oth_QLD': ['Oth_QLD_city', 'Oth_QLD_noncity'], 'Oth_SA': ['Oth_SA_city', 'Oth_SA_noncity'], 'Oth_TAS': ['Oth_TAS_city', 'Oth_TAS_noncity'], 'Oth_VIC': ['Oth_VIC_city', 'Oth_VIC_noncity'], 'Oth_WA': ['Oth_WA_city', 'Oth_WA_noncity'], 'VFR_NSW': ['VFR_NSW_city', 'VFR_NSW_noncity'], 'VFR_NT': ['VFR_NT_city', 'VFR_NT_noncity'], 'VFR_QLD': ['VFR_QLD_city', 'VFR_QLD_noncity'], 'VFR_SA': ['VFR_SA_city', 'VFR_SA_noncity'], 'VFR_TAS': ['VFR_TAS_city', 'VFR_TAS_noncity'], 'VFR_VIC': ['VFR_VIC_city', 'VFR_VIC_noncity'], 'VFR_WA': ['VFR_WA_city', 'VFR_WA_noncity']}, level_names = ['total', 'reason_level', 'region_level', 'city_level'], )"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_structure"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we see that `hierarchical_structure` has a mapping between higher level segments and adjacent lower level segments."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2.3 Create the hierarchical dataset<a class=\"anchor\" id=\"chapter2_2_3\"></a>\n",
    "To convert data to `TSDataset` call the constructor and pass detected `hierarchical_structure`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus_NSW_city</th>\n",
       "      <th>Bus_NSW_noncity</th>\n",
       "      <th>Bus_NT_city</th>\n",
       "      <th>Bus_NT_noncity</th>\n",
       "      <th>Bus_QLD_city</th>\n",
       "      <th>Bus_QLD_noncity</th>\n",
       "      <th>Bus_SA_city</th>\n",
       "      <th>Bus_SA_noncity</th>\n",
       "      <th>Bus_TAS_city</th>\n",
       "      <th>Bus_TAS_noncity</th>\n",
       "      <th>Bus_VIC_city</th>\n",
       "      <th>Bus_VIC_noncity</th>\n",
       "      <th>Bus_WA_city</th>\n",
       "      <th>Bus_WA_noncity</th>\n",
       "      <th>Hol_NSW_city</th>\n",
       "      <th>Hol_NSW_noncity</th>\n",
       "      <th>Hol_NT_city</th>\n",
       "      <th>Hol_NT_noncity</th>\n",
       "      <th>Hol_QLD_city</th>\n",
       "      <th>Hol_QLD_noncity</th>\n",
       "      <th>Hol_SA_city</th>\n",
       "      <th>Hol_SA_noncity</th>\n",
       "      <th>Hol_TAS_city</th>\n",
       "      <th>Hol_TAS_noncity</th>\n",
       "      <th>Hol_VIC_city</th>\n",
       "      <th>Hol_VIC_noncity</th>\n",
       "      <th>Hol_WA_city</th>\n",
       "      <th>Hol_WA_noncity</th>\n",
       "      <th>Oth_NSW_city</th>\n",
       "      <th>Oth_NSW_noncity</th>\n",
       "      <th>Oth_NT_city</th>\n",
       "      <th>Oth_NT_noncity</th>\n",
       "      <th>Oth_QLD_city</th>\n",
       "      <th>Oth_QLD_noncity</th>\n",
       "      <th>Oth_SA_city</th>\n",
       "      <th>Oth_SA_noncity</th>\n",
       "      <th>Oth_TAS_city</th>\n",
       "      <th>Oth_TAS_noncity</th>\n",
       "      <th>Oth_VIC_city</th>\n",
       "      <th>Oth_VIC_noncity</th>\n",
       "      <th>Oth_WA_city</th>\n",
       "      <th>Oth_WA_noncity</th>\n",
       "      <th>VFR_NSW_city</th>\n",
       "      <th>VFR_NSW_noncity</th>\n",
       "      <th>VFR_NT_city</th>\n",
       "      <th>VFR_NT_noncity</th>\n",
       "      <th>VFR_QLD_city</th>\n",
       "      <th>VFR_QLD_noncity</th>\n",
       "      <th>VFR_SA_city</th>\n",
       "      <th>VFR_SA_noncity</th>\n",
       "      <th>VFR_TAS_city</th>\n",
       "      <th>VFR_TAS_noncity</th>\n",
       "      <th>VFR_VIC_city</th>\n",
       "      <th>VFR_VIC_noncity</th>\n",
       "      <th>VFR_WA_city</th>\n",
       "      <th>VFR_WA_noncity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1201</td>\n",
       "      <td>1684</td>\n",
       "      <td>136</td>\n",
       "      <td>80</td>\n",
       "      <td>1111</td>\n",
       "      <td>982</td>\n",
       "      <td>388</td>\n",
       "      <td>456</td>\n",
       "      <td>116</td>\n",
       "      <td>107</td>\n",
       "      <td>1164</td>\n",
       "      <td>984</td>\n",
       "      <td>532</td>\n",
       "      <td>874</td>\n",
       "      <td>3096</td>\n",
       "      <td>14493</td>\n",
       "      <td>101</td>\n",
       "      <td>86</td>\n",
       "      <td>4688</td>\n",
       "      <td>4390</td>\n",
       "      <td>888</td>\n",
       "      <td>2201</td>\n",
       "      <td>619</td>\n",
       "      <td>1483</td>\n",
       "      <td>2531</td>\n",
       "      <td>7881</td>\n",
       "      <td>1383</td>\n",
       "      <td>2066</td>\n",
       "      <td>396</td>\n",
       "      <td>510</td>\n",
       "      <td>35</td>\n",
       "      <td>8</td>\n",
       "      <td>431</td>\n",
       "      <td>271</td>\n",
       "      <td>244</td>\n",
       "      <td>73</td>\n",
       "      <td>76</td>\n",
       "      <td>24</td>\n",
       "      <td>181</td>\n",
       "      <td>286</td>\n",
       "      <td>168</td>\n",
       "      <td>37</td>\n",
       "      <td>2709</td>\n",
       "      <td>6689</td>\n",
       "      <td>28</td>\n",
       "      <td>9</td>\n",
       "      <td>3003</td>\n",
       "      <td>2287</td>\n",
       "      <td>1324</td>\n",
       "      <td>869</td>\n",
       "      <td>602</td>\n",
       "      <td>748</td>\n",
       "      <td>2565</td>\n",
       "      <td>3428</td>\n",
       "      <td>1019</td>\n",
       "      <td>762</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2020</td>\n",
       "      <td>2281</td>\n",
       "      <td>138</td>\n",
       "      <td>170</td>\n",
       "      <td>776</td>\n",
       "      <td>1448</td>\n",
       "      <td>346</td>\n",
       "      <td>403</td>\n",
       "      <td>83</td>\n",
       "      <td>290</td>\n",
       "      <td>1014</td>\n",
       "      <td>811</td>\n",
       "      <td>356</td>\n",
       "      <td>1687</td>\n",
       "      <td>1479</td>\n",
       "      <td>9548</td>\n",
       "      <td>201</td>\n",
       "      <td>297</td>\n",
       "      <td>2320</td>\n",
       "      <td>3990</td>\n",
       "      <td>521</td>\n",
       "      <td>1414</td>\n",
       "      <td>409</td>\n",
       "      <td>689</td>\n",
       "      <td>1439</td>\n",
       "      <td>4586</td>\n",
       "      <td>1059</td>\n",
       "      <td>1395</td>\n",
       "      <td>657</td>\n",
       "      <td>581</td>\n",
       "      <td>69</td>\n",
       "      <td>39</td>\n",
       "      <td>669</td>\n",
       "      <td>170</td>\n",
       "      <td>142</td>\n",
       "      <td>221</td>\n",
       "      <td>36</td>\n",
       "      <td>61</td>\n",
       "      <td>229</td>\n",
       "      <td>323</td>\n",
       "      <td>170</td>\n",
       "      <td>99</td>\n",
       "      <td>2184</td>\n",
       "      <td>5645</td>\n",
       "      <td>168</td>\n",
       "      <td>349</td>\n",
       "      <td>1957</td>\n",
       "      <td>2945</td>\n",
       "      <td>806</td>\n",
       "      <td>639</td>\n",
       "      <td>257</td>\n",
       "      <td>266</td>\n",
       "      <td>1852</td>\n",
       "      <td>2255</td>\n",
       "      <td>750</td>\n",
       "      <td>603</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>1975</td>\n",
       "      <td>2118</td>\n",
       "      <td>452</td>\n",
       "      <td>1084</td>\n",
       "      <td>1079</td>\n",
       "      <td>2300</td>\n",
       "      <td>390</td>\n",
       "      <td>360</td>\n",
       "      <td>196</td>\n",
       "      <td>107</td>\n",
       "      <td>1153</td>\n",
       "      <td>791</td>\n",
       "      <td>440</td>\n",
       "      <td>1120</td>\n",
       "      <td>1609</td>\n",
       "      <td>7301</td>\n",
       "      <td>619</td>\n",
       "      <td>745</td>\n",
       "      <td>4758</td>\n",
       "      <td>6975</td>\n",
       "      <td>476</td>\n",
       "      <td>1093</td>\n",
       "      <td>127</td>\n",
       "      <td>331</td>\n",
       "      <td>1488</td>\n",
       "      <td>3572</td>\n",
       "      <td>1101</td>\n",
       "      <td>2297</td>\n",
       "      <td>540</td>\n",
       "      <td>893</td>\n",
       "      <td>150</td>\n",
       "      <td>338</td>\n",
       "      <td>270</td>\n",
       "      <td>1164</td>\n",
       "      <td>397</td>\n",
       "      <td>315</td>\n",
       "      <td>32</td>\n",
       "      <td>23</td>\n",
       "      <td>128</td>\n",
       "      <td>318</td>\n",
       "      <td>380</td>\n",
       "      <td>1166</td>\n",
       "      <td>2225</td>\n",
       "      <td>5052</td>\n",
       "      <td>390</td>\n",
       "      <td>84</td>\n",
       "      <td>2619</td>\n",
       "      <td>2870</td>\n",
       "      <td>1078</td>\n",
       "      <td>375</td>\n",
       "      <td>130</td>\n",
       "      <td>261</td>\n",
       "      <td>1882</td>\n",
       "      <td>1929</td>\n",
       "      <td>953</td>\n",
       "      <td>734</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>1500</td>\n",
       "      <td>1963</td>\n",
       "      <td>243</td>\n",
       "      <td>160</td>\n",
       "      <td>1128</td>\n",
       "      <td>1752</td>\n",
       "      <td>255</td>\n",
       "      <td>635</td>\n",
       "      <td>70</td>\n",
       "      <td>228</td>\n",
       "      <td>1245</td>\n",
       "      <td>508</td>\n",
       "      <td>539</td>\n",
       "      <td>1252</td>\n",
       "      <td>1520</td>\n",
       "      <td>9138</td>\n",
       "      <td>164</td>\n",
       "      <td>250</td>\n",
       "      <td>3328</td>\n",
       "      <td>4781</td>\n",
       "      <td>571</td>\n",
       "      <td>1699</td>\n",
       "      <td>371</td>\n",
       "      <td>949</td>\n",
       "      <td>1906</td>\n",
       "      <td>3575</td>\n",
       "      <td>1128</td>\n",
       "      <td>2433</td>\n",
       "      <td>745</td>\n",
       "      <td>1157</td>\n",
       "      <td>172</td>\n",
       "      <td>453</td>\n",
       "      <td>214</td>\n",
       "      <td>535</td>\n",
       "      <td>194</td>\n",
       "      <td>260</td>\n",
       "      <td>48</td>\n",
       "      <td>43</td>\n",
       "      <td>270</td>\n",
       "      <td>336</td>\n",
       "      <td>410</td>\n",
       "      <td>1139</td>\n",
       "      <td>2918</td>\n",
       "      <td>5385</td>\n",
       "      <td>244</td>\n",
       "      <td>218</td>\n",
       "      <td>2097</td>\n",
       "      <td>2344</td>\n",
       "      <td>568</td>\n",
       "      <td>641</td>\n",
       "      <td>137</td>\n",
       "      <td>257</td>\n",
       "      <td>2208</td>\n",
       "      <td>2882</td>\n",
       "      <td>999</td>\n",
       "      <td>715</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>1196</td>\n",
       "      <td>2151</td>\n",
       "      <td>194</td>\n",
       "      <td>189</td>\n",
       "      <td>1192</td>\n",
       "      <td>1559</td>\n",
       "      <td>386</td>\n",
       "      <td>280</td>\n",
       "      <td>130</td>\n",
       "      <td>205</td>\n",
       "      <td>950</td>\n",
       "      <td>572</td>\n",
       "      <td>582</td>\n",
       "      <td>441</td>\n",
       "      <td>1958</td>\n",
       "      <td>14194</td>\n",
       "      <td>62</td>\n",
       "      <td>151</td>\n",
       "      <td>4930</td>\n",
       "      <td>5117</td>\n",
       "      <td>873</td>\n",
       "      <td>2150</td>\n",
       "      <td>523</td>\n",
       "      <td>1590</td>\n",
       "      <td>2517</td>\n",
       "      <td>8441</td>\n",
       "      <td>1560</td>\n",
       "      <td>2727</td>\n",
       "      <td>426</td>\n",
       "      <td>558</td>\n",
       "      <td>15</td>\n",
       "      <td>47</td>\n",
       "      <td>458</td>\n",
       "      <td>557</td>\n",
       "      <td>147</td>\n",
       "      <td>33</td>\n",
       "      <td>77</td>\n",
       "      <td>60</td>\n",
       "      <td>265</td>\n",
       "      <td>293</td>\n",
       "      <td>162</td>\n",
       "      <td>28</td>\n",
       "      <td>3154</td>\n",
       "      <td>7232</td>\n",
       "      <td>153</td>\n",
       "      <td>125</td>\n",
       "      <td>2703</td>\n",
       "      <td>2933</td>\n",
       "      <td>887</td>\n",
       "      <td>798</td>\n",
       "      <td>347</td>\n",
       "      <td>437</td>\n",
       "      <td>2988</td>\n",
       "      <td>3164</td>\n",
       "      <td>1396</td>\n",
       "      <td>630</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment    Bus_NSW_city Bus_NSW_noncity Bus_NT_city Bus_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1201            1684         136             80   \n",
       "2006-02-01         2020            2281         138            170   \n",
       "2006-03-01         1975            2118         452           1084   \n",
       "2006-04-01         1500            1963         243            160   \n",
       "2006-05-01         1196            2151         194            189   \n",
       "\n",
       "segment    Bus_QLD_city Bus_QLD_noncity Bus_SA_city Bus_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1111             982         388            456   \n",
       "2006-02-01          776            1448         346            403   \n",
       "2006-03-01         1079            2300         390            360   \n",
       "2006-04-01         1128            1752         255            635   \n",
       "2006-05-01         1192            1559         386            280   \n",
       "\n",
       "segment    Bus_TAS_city Bus_TAS_noncity Bus_VIC_city Bus_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01          116             107         1164             984   \n",
       "2006-02-01           83             290         1014             811   \n",
       "2006-03-01          196             107         1153             791   \n",
       "2006-04-01           70             228         1245             508   \n",
       "2006-05-01          130             205          950             572   \n",
       "\n",
       "segment    Bus_WA_city Bus_WA_noncity Hol_NSW_city Hol_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         532            874         3096           14493   \n",
       "2006-02-01         356           1687         1479            9548   \n",
       "2006-03-01         440           1120         1609            7301   \n",
       "2006-04-01         539           1252         1520            9138   \n",
       "2006-05-01         582            441         1958           14194   \n",
       "\n",
       "segment    Hol_NT_city Hol_NT_noncity Hol_QLD_city Hol_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         101             86         4688            4390   \n",
       "2006-02-01         201            297         2320            3990   \n",
       "2006-03-01         619            745         4758            6975   \n",
       "2006-04-01         164            250         3328            4781   \n",
       "2006-05-01          62            151         4930            5117   \n",
       "\n",
       "segment    Hol_SA_city Hol_SA_noncity Hol_TAS_city Hol_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         888           2201          619            1483   \n",
       "2006-02-01         521           1414          409             689   \n",
       "2006-03-01         476           1093          127             331   \n",
       "2006-04-01         571           1699          371             949   \n",
       "2006-05-01         873           2150          523            1590   \n",
       "\n",
       "segment    Hol_VIC_city Hol_VIC_noncity Hol_WA_city Hol_WA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         2531            7881        1383           2066   \n",
       "2006-02-01         1439            4586        1059           1395   \n",
       "2006-03-01         1488            3572        1101           2297   \n",
       "2006-04-01         1906            3575        1128           2433   \n",
       "2006-05-01         2517            8441        1560           2727   \n",
       "\n",
       "segment    Oth_NSW_city Oth_NSW_noncity Oth_NT_city Oth_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          396             510          35              8   \n",
       "2006-02-01          657             581          69             39   \n",
       "2006-03-01          540             893         150            338   \n",
       "2006-04-01          745            1157         172            453   \n",
       "2006-05-01          426             558          15             47   \n",
       "\n",
       "segment    Oth_QLD_city Oth_QLD_noncity Oth_SA_city Oth_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          431             271         244             73   \n",
       "2006-02-01          669             170         142            221   \n",
       "2006-03-01          270            1164         397            315   \n",
       "2006-04-01          214             535         194            260   \n",
       "2006-05-01          458             557         147             33   \n",
       "\n",
       "segment    Oth_TAS_city Oth_TAS_noncity Oth_VIC_city Oth_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01           76              24          181             286   \n",
       "2006-02-01           36              61          229             323   \n",
       "2006-03-01           32              23          128             318   \n",
       "2006-04-01           48              43          270             336   \n",
       "2006-05-01           77              60          265             293   \n",
       "\n",
       "segment    Oth_WA_city Oth_WA_noncity VFR_NSW_city VFR_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         168             37         2709            6689   \n",
       "2006-02-01         170             99         2184            5645   \n",
       "2006-03-01         380           1166         2225            5052   \n",
       "2006-04-01         410           1139         2918            5385   \n",
       "2006-05-01         162             28         3154            7232   \n",
       "\n",
       "segment    VFR_NT_city VFR_NT_noncity VFR_QLD_city VFR_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01          28              9         3003            2287   \n",
       "2006-02-01         168            349         1957            2945   \n",
       "2006-03-01         390             84         2619            2870   \n",
       "2006-04-01         244            218         2097            2344   \n",
       "2006-05-01         153            125         2703            2933   \n",
       "\n",
       "segment    VFR_SA_city VFR_SA_noncity VFR_TAS_city VFR_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01        1324            869          602             748   \n",
       "2006-02-01         806            639          257             266   \n",
       "2006-03-01        1078            375          130             261   \n",
       "2006-04-01         568            641          137             257   \n",
       "2006-05-01         887            798          347             437   \n",
       "\n",
       "segment    VFR_VIC_city VFR_VIC_noncity VFR_WA_city VFR_WA_noncity  \n",
       "feature          target          target      target         target  \n",
       "timestamp                                                           \n",
       "2006-01-01         2565            3428        1019            762  \n",
       "2006-02-01         1852            2255         750            603  \n",
       "2006-03-01         1882            1929         953            734  \n",
       "2006-04-01         2208            2882         999            715  \n",
       "2006-05-01         2988            3164        1396            630  "
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts = TSDataset(df=hierarchical_df, freq=\"MS\", hierarchical_structure=hierarchical_structure)\n",
    "hierarchical_ts.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now the dataset is converted to a hierarchical one. We can examine which hierarchical levels were detected with the following code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['total', 'reason_level', 'region_level', 'city_level']"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts.hierarchical_structure.level_names"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ensure that the dataset is at the desired level."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'city_level'"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts.current_df_level"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hierarchical dataset can be aggregated to higher levels with the `get_level_dataset` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Hol</th>\n",
       "      <th>VFR</th>\n",
       "      <th>Bus</th>\n",
       "      <th>Oth</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>45906</td>\n",
       "      <td>26042</td>\n",
       "      <td>9815</td>\n",
       "      <td>2740</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>29347</td>\n",
       "      <td>20676</td>\n",
       "      <td>11823</td>\n",
       "      <td>3466</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>32492</td>\n",
       "      <td>20582</td>\n",
       "      <td>13565</td>\n",
       "      <td>6114</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>31813</td>\n",
       "      <td>21613</td>\n",
       "      <td>11478</td>\n",
       "      <td>5976</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>46793</td>\n",
       "      <td>26947</td>\n",
       "      <td>10027</td>\n",
       "      <td>3126</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment       Hol    VFR    Bus    Oth\n",
       "feature    target target target target\n",
       "timestamp                             \n",
       "2006-01-01  45906  26042   9815   2740\n",
       "2006-02-01  29347  20676  11823   3466\n",
       "2006-03-01  32492  20582  13565   6114\n",
       "2006-04-01  31813  21613  11478   5976\n",
       "2006-05-01  46793  26947  10027   3126"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts.get_level_dataset(target_level=\"reason_level\").head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Reconciliation methods <a class=\"anchor\" id=\"chapter3\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this section we will examine the reconciliation methods available in ETNA.\n",
    "Hierarchical time series reconciliation adjusts the forecasts produced by separate models on\n",
    "a set of hierarchically linked time series in order to make the forecasts more accurate and to ensure that they are coherent.\n",
    "\n",
    "The ETNA library provides the following reconciliation methods:\n",
    "* Bottom-up approach\n",
    "* Top-down approach\n",
    "\n",
    "The middle-out reconciliation approach can be viewed as a composition of the bottom-up and top-down approaches, so it can\n",
    "be implemented using functionality from the library: use the provided bottom-up methods for aggregation to higher levels\n",
    "and the top-down methods for disaggregation to lower levels.\n",
    "\n",
    "Reconciliation methods estimate mapping matrices to perform transitions between levels. These matrices are sparse,\n",
    "so ETNA uses `scipy.sparse.csr_matrix` to store them and perform computations efficiently.\n",
    "\n",
    "More detailed information about these and other reconciliation methods can be found [here](https://otexts.com/fpp2/hierarchical.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "from etna.transforms import LagTransform\n",
    "from etna.transforms import OneHotEncoderTransform\n",
    "from etna.models import LinearPerSegmentModel\n",
    "from etna.metrics import SMAPE\n",
    "from etna.pipeline import HierarchicalPipeline\n",
    "from etna.pipeline import Pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some important definitions:\n",
    "* **source level** - the level of the hierarchy at which the models are estimated; reconciliation is applied to the forecasts at this level\n",
    "* **target level** - the desired level of the hierarchy; after reconciliation we want to have the series at this level."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.1. Bottom-up approach <a class=\"anchor\" id=\"chapter3_1\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The main idea of this approach is to produce forecasts for time series at lower hierarchical levels and then\n",
    "aggregate them to higher levels.\n",
    "\n",
    "$$\n",
    "\\hat y_{A,h} = \\hat y_{AA,h} + \\hat y_{AB,h}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\hat y_{B,h} = \\hat y_{BA,h} + \\hat y_{BB,h} + \\hat y_{BC,h}\n",
    "$$\n",
    "\n",
    "where $h$ is the forecast horizon."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In matrix notation:\n",
    "\n",
    "\\begin{equation*}\n",
    "    \\begin{bmatrix}\n",
    "    \\hat y_{A,h} \\\\ \\hat y_{B,h}\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    1 & 1 & 0 & 0 & 0 \\\\\n",
    "    0 & 0 & 1 & 1 & 1\n",
    "    \\end{bmatrix}\n",
    "    \\begin{bmatrix}\n",
    "    \\hat y_{AA,h} \\\\ \\hat y_{AB,h} \\\\ \\hat y_{BA,h} \\\\ \\hat y_{BB,h} \\\\ \\hat y_{BC,h}\n",
    "    \\end{bmatrix}\n",
    "\\end{equation*}\n",
    "\n",
    "An advantage of this approach is that we forecast at the bottom level of the structure and are able to capture\n",
    "all the patterns of the individual series. On the other hand, bottom-level data can be quite noisy and more challenging\n",
    "to model and forecast."
   ]
  },
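  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check of the matrix form above, here is a worked example with made-up numbers\n",
    "(the forecast values are hypothetical and chosen only for illustration; they do not come from the dataset).\n",
    "Assume $\\hat y_{AA,h} = 10$, $\\hat y_{AB,h} = 20$, $\\hat y_{BA,h} = 5$, $\\hat y_{BB,h} = 15$ and $\\hat y_{BC,h} = 30$. Then\n",
    "\n",
    "\\begin{equation*}\n",
    "    \\begin{bmatrix}\n",
    "    \\hat y_{A,h} \\\\ \\hat y_{B,h}\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    1 & 1 & 0 & 0 & 0 \\\\\n",
    "    0 & 0 & 1 & 1 & 1\n",
    "    \\end{bmatrix}\n",
    "    \\begin{bmatrix}\n",
    "    10 \\\\ 20 \\\\ 5 \\\\ 15 \\\\ 30\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    10 + 20 \\\\ 5 + 15 + 30\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    30 \\\\ 50\n",
    "    \\end{bmatrix}\n",
    "\\end{equation*}\n",
    "\n",
    "Each row of the mapping matrix selects the bottom-level series that belong to one higher-level series and sums them."
   ]
  },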
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "from etna.reconciliation import BottomUpReconciliator"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To create a `BottomUpReconciliator`, specify the \"source\" and \"target\" levels for aggregation. Make sure that the source\n",
    "level is lower in the hierarchy than the target level."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "reconciliator = BottomUpReconciliator(target_level=\"region_level\", source_level=\"city_level\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next step is to estimate the mapping matrix. To do so, pass the hierarchical dataset to the `fit` method. The current dataset level\n",
    "should be lower than or equal to the source level."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 1, 0, ..., 0, 0, 0],\n",
       "       [0, 0, 1, ..., 0, 0, 0],\n",
       "       [0, 0, 0, ..., 0, 0, 0],\n",
       "       ...,\n",
       "       [0, 0, 0, ..., 0, 0, 0],\n",
       "       [0, 0, 0, ..., 1, 0, 0],\n",
       "       [0, 0, 0, ..., 0, 1, 1]])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reconciliator.fit(ts=hierarchical_ts)\n",
    "reconciliator.mapping_matrix.toarray()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now `reconciliator` is ready to perform aggregation to the target level."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus_NSW_city</th>\n",
       "      <th>Bus_NSW_noncity</th>\n",
       "      <th>Bus_NT_city</th>\n",
       "      <th>Bus_NT_noncity</th>\n",
       "      <th>Bus_QLD_city</th>\n",
       "      <th>Bus_QLD_noncity</th>\n",
       "      <th>Bus_SA_city</th>\n",
       "      <th>Bus_SA_noncity</th>\n",
       "      <th>Bus_TAS_city</th>\n",
       "      <th>Bus_TAS_noncity</th>\n",
       "      <th>Bus_VIC_city</th>\n",
       "      <th>Bus_VIC_noncity</th>\n",
       "      <th>Bus_WA_city</th>\n",
       "      <th>Bus_WA_noncity</th>\n",
       "      <th>Hol_NSW_city</th>\n",
       "      <th>Hol_NSW_noncity</th>\n",
       "      <th>Hol_NT_city</th>\n",
       "      <th>Hol_NT_noncity</th>\n",
       "      <th>Hol_QLD_city</th>\n",
       "      <th>Hol_QLD_noncity</th>\n",
       "      <th>Hol_SA_city</th>\n",
       "      <th>Hol_SA_noncity</th>\n",
       "      <th>Hol_TAS_city</th>\n",
       "      <th>Hol_TAS_noncity</th>\n",
       "      <th>Hol_VIC_city</th>\n",
       "      <th>Hol_VIC_noncity</th>\n",
       "      <th>Hol_WA_city</th>\n",
       "      <th>Hol_WA_noncity</th>\n",
       "      <th>Oth_NSW_city</th>\n",
       "      <th>Oth_NSW_noncity</th>\n",
       "      <th>Oth_NT_city</th>\n",
       "      <th>Oth_NT_noncity</th>\n",
       "      <th>Oth_QLD_city</th>\n",
       "      <th>Oth_QLD_noncity</th>\n",
       "      <th>Oth_SA_city</th>\n",
       "      <th>Oth_SA_noncity</th>\n",
       "      <th>Oth_TAS_city</th>\n",
       "      <th>Oth_TAS_noncity</th>\n",
       "      <th>Oth_VIC_city</th>\n",
       "      <th>Oth_VIC_noncity</th>\n",
       "      <th>Oth_WA_city</th>\n",
       "      <th>Oth_WA_noncity</th>\n",
       "      <th>VFR_NSW_city</th>\n",
       "      <th>VFR_NSW_noncity</th>\n",
       "      <th>VFR_NT_city</th>\n",
       "      <th>VFR_NT_noncity</th>\n",
       "      <th>VFR_QLD_city</th>\n",
       "      <th>VFR_QLD_noncity</th>\n",
       "      <th>VFR_SA_city</th>\n",
       "      <th>VFR_SA_noncity</th>\n",
       "      <th>VFR_TAS_city</th>\n",
       "      <th>VFR_TAS_noncity</th>\n",
       "      <th>VFR_VIC_city</th>\n",
       "      <th>VFR_VIC_noncity</th>\n",
       "      <th>VFR_WA_city</th>\n",
       "      <th>VFR_WA_noncity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1201</td>\n",
       "      <td>1684</td>\n",
       "      <td>136</td>\n",
       "      <td>80</td>\n",
       "      <td>1111</td>\n",
       "      <td>982</td>\n",
       "      <td>388</td>\n",
       "      <td>456</td>\n",
       "      <td>116</td>\n",
       "      <td>107</td>\n",
       "      <td>1164</td>\n",
       "      <td>984</td>\n",
       "      <td>532</td>\n",
       "      <td>874</td>\n",
       "      <td>3096</td>\n",
       "      <td>14493</td>\n",
       "      <td>101</td>\n",
       "      <td>86</td>\n",
       "      <td>4688</td>\n",
       "      <td>4390</td>\n",
       "      <td>888</td>\n",
       "      <td>2201</td>\n",
       "      <td>619</td>\n",
       "      <td>1483</td>\n",
       "      <td>2531</td>\n",
       "      <td>7881</td>\n",
       "      <td>1383</td>\n",
       "      <td>2066</td>\n",
       "      <td>396</td>\n",
       "      <td>510</td>\n",
       "      <td>35</td>\n",
       "      <td>8</td>\n",
       "      <td>431</td>\n",
       "      <td>271</td>\n",
       "      <td>244</td>\n",
       "      <td>73</td>\n",
       "      <td>76</td>\n",
       "      <td>24</td>\n",
       "      <td>181</td>\n",
       "      <td>286</td>\n",
       "      <td>168</td>\n",
       "      <td>37</td>\n",
       "      <td>2709</td>\n",
       "      <td>6689</td>\n",
       "      <td>28</td>\n",
       "      <td>9</td>\n",
       "      <td>3003</td>\n",
       "      <td>2287</td>\n",
       "      <td>1324</td>\n",
       "      <td>869</td>\n",
       "      <td>602</td>\n",
       "      <td>748</td>\n",
       "      <td>2565</td>\n",
       "      <td>3428</td>\n",
       "      <td>1019</td>\n",
       "      <td>762</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2020</td>\n",
       "      <td>2281</td>\n",
       "      <td>138</td>\n",
       "      <td>170</td>\n",
       "      <td>776</td>\n",
       "      <td>1448</td>\n",
       "      <td>346</td>\n",
       "      <td>403</td>\n",
       "      <td>83</td>\n",
       "      <td>290</td>\n",
       "      <td>1014</td>\n",
       "      <td>811</td>\n",
       "      <td>356</td>\n",
       "      <td>1687</td>\n",
       "      <td>1479</td>\n",
       "      <td>9548</td>\n",
       "      <td>201</td>\n",
       "      <td>297</td>\n",
       "      <td>2320</td>\n",
       "      <td>3990</td>\n",
       "      <td>521</td>\n",
       "      <td>1414</td>\n",
       "      <td>409</td>\n",
       "      <td>689</td>\n",
       "      <td>1439</td>\n",
       "      <td>4586</td>\n",
       "      <td>1059</td>\n",
       "      <td>1395</td>\n",
       "      <td>657</td>\n",
       "      <td>581</td>\n",
       "      <td>69</td>\n",
       "      <td>39</td>\n",
       "      <td>669</td>\n",
       "      <td>170</td>\n",
       "      <td>142</td>\n",
       "      <td>221</td>\n",
       "      <td>36</td>\n",
       "      <td>61</td>\n",
       "      <td>229</td>\n",
       "      <td>323</td>\n",
       "      <td>170</td>\n",
       "      <td>99</td>\n",
       "      <td>2184</td>\n",
       "      <td>5645</td>\n",
       "      <td>168</td>\n",
       "      <td>349</td>\n",
       "      <td>1957</td>\n",
       "      <td>2945</td>\n",
       "      <td>806</td>\n",
       "      <td>639</td>\n",
       "      <td>257</td>\n",
       "      <td>266</td>\n",
       "      <td>1852</td>\n",
       "      <td>2255</td>\n",
       "      <td>750</td>\n",
       "      <td>603</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>1975</td>\n",
       "      <td>2118</td>\n",
       "      <td>452</td>\n",
       "      <td>1084</td>\n",
       "      <td>1079</td>\n",
       "      <td>2300</td>\n",
       "      <td>390</td>\n",
       "      <td>360</td>\n",
       "      <td>196</td>\n",
       "      <td>107</td>\n",
       "      <td>1153</td>\n",
       "      <td>791</td>\n",
       "      <td>440</td>\n",
       "      <td>1120</td>\n",
       "      <td>1609</td>\n",
       "      <td>7301</td>\n",
       "      <td>619</td>\n",
       "      <td>745</td>\n",
       "      <td>4758</td>\n",
       "      <td>6975</td>\n",
       "      <td>476</td>\n",
       "      <td>1093</td>\n",
       "      <td>127</td>\n",
       "      <td>331</td>\n",
       "      <td>1488</td>\n",
       "      <td>3572</td>\n",
       "      <td>1101</td>\n",
       "      <td>2297</td>\n",
       "      <td>540</td>\n",
       "      <td>893</td>\n",
       "      <td>150</td>\n",
       "      <td>338</td>\n",
       "      <td>270</td>\n",
       "      <td>1164</td>\n",
       "      <td>397</td>\n",
       "      <td>315</td>\n",
       "      <td>32</td>\n",
       "      <td>23</td>\n",
       "      <td>128</td>\n",
       "      <td>318</td>\n",
       "      <td>380</td>\n",
       "      <td>1166</td>\n",
       "      <td>2225</td>\n",
       "      <td>5052</td>\n",
       "      <td>390</td>\n",
       "      <td>84</td>\n",
       "      <td>2619</td>\n",
       "      <td>2870</td>\n",
       "      <td>1078</td>\n",
       "      <td>375</td>\n",
       "      <td>130</td>\n",
       "      <td>261</td>\n",
       "      <td>1882</td>\n",
       "      <td>1929</td>\n",
       "      <td>953</td>\n",
       "      <td>734</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>1500</td>\n",
       "      <td>1963</td>\n",
       "      <td>243</td>\n",
       "      <td>160</td>\n",
       "      <td>1128</td>\n",
       "      <td>1752</td>\n",
       "      <td>255</td>\n",
       "      <td>635</td>\n",
       "      <td>70</td>\n",
       "      <td>228</td>\n",
       "      <td>1245</td>\n",
       "      <td>508</td>\n",
       "      <td>539</td>\n",
       "      <td>1252</td>\n",
       "      <td>1520</td>\n",
       "      <td>9138</td>\n",
       "      <td>164</td>\n",
       "      <td>250</td>\n",
       "      <td>3328</td>\n",
       "      <td>4781</td>\n",
       "      <td>571</td>\n",
       "      <td>1699</td>\n",
       "      <td>371</td>\n",
       "      <td>949</td>\n",
       "      <td>1906</td>\n",
       "      <td>3575</td>\n",
       "      <td>1128</td>\n",
       "      <td>2433</td>\n",
       "      <td>745</td>\n",
       "      <td>1157</td>\n",
       "      <td>172</td>\n",
       "      <td>453</td>\n",
       "      <td>214</td>\n",
       "      <td>535</td>\n",
       "      <td>194</td>\n",
       "      <td>260</td>\n",
       "      <td>48</td>\n",
       "      <td>43</td>\n",
       "      <td>270</td>\n",
       "      <td>336</td>\n",
       "      <td>410</td>\n",
       "      <td>1139</td>\n",
       "      <td>2918</td>\n",
       "      <td>5385</td>\n",
       "      <td>244</td>\n",
       "      <td>218</td>\n",
       "      <td>2097</td>\n",
       "      <td>2344</td>\n",
       "      <td>568</td>\n",
       "      <td>641</td>\n",
       "      <td>137</td>\n",
       "      <td>257</td>\n",
       "      <td>2208</td>\n",
       "      <td>2882</td>\n",
       "      <td>999</td>\n",
       "      <td>715</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>1196</td>\n",
       "      <td>2151</td>\n",
       "      <td>194</td>\n",
       "      <td>189</td>\n",
       "      <td>1192</td>\n",
       "      <td>1559</td>\n",
       "      <td>386</td>\n",
       "      <td>280</td>\n",
       "      <td>130</td>\n",
       "      <td>205</td>\n",
       "      <td>950</td>\n",
       "      <td>572</td>\n",
       "      <td>582</td>\n",
       "      <td>441</td>\n",
       "      <td>1958</td>\n",
       "      <td>14194</td>\n",
       "      <td>62</td>\n",
       "      <td>151</td>\n",
       "      <td>4930</td>\n",
       "      <td>5117</td>\n",
       "      <td>873</td>\n",
       "      <td>2150</td>\n",
       "      <td>523</td>\n",
       "      <td>1590</td>\n",
       "      <td>2517</td>\n",
       "      <td>8441</td>\n",
       "      <td>1560</td>\n",
       "      <td>2727</td>\n",
       "      <td>426</td>\n",
       "      <td>558</td>\n",
       "      <td>15</td>\n",
       "      <td>47</td>\n",
       "      <td>458</td>\n",
       "      <td>557</td>\n",
       "      <td>147</td>\n",
       "      <td>33</td>\n",
       "      <td>77</td>\n",
       "      <td>60</td>\n",
       "      <td>265</td>\n",
       "      <td>293</td>\n",
       "      <td>162</td>\n",
       "      <td>28</td>\n",
       "      <td>3154</td>\n",
       "      <td>7232</td>\n",
       "      <td>153</td>\n",
       "      <td>125</td>\n",
       "      <td>2703</td>\n",
       "      <td>2933</td>\n",
       "      <td>887</td>\n",
       "      <td>798</td>\n",
       "      <td>347</td>\n",
       "      <td>437</td>\n",
       "      <td>2988</td>\n",
       "      <td>3164</td>\n",
       "      <td>1396</td>\n",
       "      <td>630</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment    Bus_NSW_city Bus_NSW_noncity Bus_NT_city Bus_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1201            1684         136             80   \n",
       "2006-02-01         2020            2281         138            170   \n",
       "2006-03-01         1975            2118         452           1084   \n",
       "2006-04-01         1500            1963         243            160   \n",
       "2006-05-01         1196            2151         194            189   \n",
       "\n",
       "segment    Bus_QLD_city Bus_QLD_noncity Bus_SA_city Bus_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1111             982         388            456   \n",
       "2006-02-01          776            1448         346            403   \n",
       "2006-03-01         1079            2300         390            360   \n",
       "2006-04-01         1128            1752         255            635   \n",
       "2006-05-01         1192            1559         386            280   \n",
       "\n",
       "segment    Bus_TAS_city Bus_TAS_noncity Bus_VIC_city Bus_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01          116             107         1164             984   \n",
       "2006-02-01           83             290         1014             811   \n",
       "2006-03-01          196             107         1153             791   \n",
       "2006-04-01           70             228         1245             508   \n",
       "2006-05-01          130             205          950             572   \n",
       "\n",
       "segment    Bus_WA_city Bus_WA_noncity Hol_NSW_city Hol_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         532            874         3096           14493   \n",
       "2006-02-01         356           1687         1479            9548   \n",
       "2006-03-01         440           1120         1609            7301   \n",
       "2006-04-01         539           1252         1520            9138   \n",
       "2006-05-01         582            441         1958           14194   \n",
       "\n",
       "segment    Hol_NT_city Hol_NT_noncity Hol_QLD_city Hol_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         101             86         4688            4390   \n",
       "2006-02-01         201            297         2320            3990   \n",
       "2006-03-01         619            745         4758            6975   \n",
       "2006-04-01         164            250         3328            4781   \n",
       "2006-05-01          62            151         4930            5117   \n",
       "\n",
       "segment    Hol_SA_city Hol_SA_noncity Hol_TAS_city Hol_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         888           2201          619            1483   \n",
       "2006-02-01         521           1414          409             689   \n",
       "2006-03-01         476           1093          127             331   \n",
       "2006-04-01         571           1699          371             949   \n",
       "2006-05-01         873           2150          523            1590   \n",
       "\n",
       "segment    Hol_VIC_city Hol_VIC_noncity Hol_WA_city Hol_WA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         2531            7881        1383           2066   \n",
       "2006-02-01         1439            4586        1059           1395   \n",
       "2006-03-01         1488            3572        1101           2297   \n",
       "2006-04-01         1906            3575        1128           2433   \n",
       "2006-05-01         2517            8441        1560           2727   \n",
       "\n",
       "segment    Oth_NSW_city Oth_NSW_noncity Oth_NT_city Oth_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          396             510          35              8   \n",
       "2006-02-01          657             581          69             39   \n",
       "2006-03-01          540             893         150            338   \n",
       "2006-04-01          745            1157         172            453   \n",
       "2006-05-01          426             558          15             47   \n",
       "\n",
       "segment    Oth_QLD_city Oth_QLD_noncity Oth_SA_city Oth_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          431             271         244             73   \n",
       "2006-02-01          669             170         142            221   \n",
       "2006-03-01          270            1164         397            315   \n",
       "2006-04-01          214             535         194            260   \n",
       "2006-05-01          458             557         147             33   \n",
       "\n",
       "segment    Oth_TAS_city Oth_TAS_noncity Oth_VIC_city Oth_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01           76              24          181             286   \n",
       "2006-02-01           36              61          229             323   \n",
       "2006-03-01           32              23          128             318   \n",
       "2006-04-01           48              43          270             336   \n",
       "2006-05-01           77              60          265             293   \n",
       "\n",
       "segment    Oth_WA_city Oth_WA_noncity VFR_NSW_city VFR_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         168             37         2709            6689   \n",
       "2006-02-01         170             99         2184            5645   \n",
       "2006-03-01         380           1166         2225            5052   \n",
       "2006-04-01         410           1139         2918            5385   \n",
       "2006-05-01         162             28         3154            7232   \n",
       "\n",
       "segment    VFR_NT_city VFR_NT_noncity VFR_QLD_city VFR_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01          28              9         3003            2287   \n",
       "2006-02-01         168            349         1957            2945   \n",
       "2006-03-01         390             84         2619            2870   \n",
       "2006-04-01         244            218         2097            2344   \n",
       "2006-05-01         153            125         2703            2933   \n",
       "\n",
       "segment    VFR_SA_city VFR_SA_noncity VFR_TAS_city VFR_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01        1324            869          602             748   \n",
       "2006-02-01         806            639          257             266   \n",
       "2006-03-01        1078            375          130             261   \n",
       "2006-04-01         568            641          137             257   \n",
       "2006-05-01         887            798          347             437   \n",
       "\n",
       "segment    VFR_VIC_city VFR_VIC_noncity VFR_WA_city VFR_WA_noncity  \n",
       "feature          target          target      target         target  \n",
       "timestamp                                                           \n",
       "2006-01-01         2565            3428        1019            762  \n",
       "2006-02-01         1852            2255         750            603  \n",
       "2006-03-01         1882            1929         953            734  \n",
       "2006-04-01         2208            2882         999            715  \n",
       "2006-05-01         2988            3164        1396            630  "
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reconciliator.aggregate(ts=hierarchical_ts).head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`HierarchicalPipeline` provides the ability to perform forecasts reconciliation in pipeline.\n",
    "A couple of points to keep in mind while working with this type of pipeline:\n",
    "1. Transforms and model work with the dataset on the **source** level.\n",
    "2. Forecasts are automatically reconciliated to the **target** level, metrics reported for **target** level as well."
   ]
  },
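  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of these two points, the sketch below uses plain NumPy rather than ETNA: hypothetical\n",
    "source-level forecasts are reconciled to the target level with a bottom-up summing matrix. The arrays\n",
    "`source_forecast` and `mapping`, and the segment names in the comments, are illustrative assumptions and not ETNA objects.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical source-level forecasts: one row per forecast step,\n",
    "# one column per bottom segment (AA, AB, BA, BB, BC).\n",
    "source_forecast = np.array([\n",
    "    [10.0, 20.0, 5.0, 5.0, 10.0],\n",
    "    [12.0, 18.0, 6.0, 4.0, 10.0],\n",
    "])\n",
    "\n",
    "# Bottom-up mapping: each target segment is the sum of its children.\n",
    "mapping = np.array([\n",
    "    [1, 1, 0, 0, 0],  # A = AA + AB\n",
    "    [0, 0, 1, 1, 1],  # B = BA + BB + BC\n",
    "])\n",
    "\n",
    "# Reconcile: rows stay forecast steps, columns become target segments (A, B).\n",
    "target_forecast = source_forecast @ mapping.T\n",
    "```\n",
    "\n",
    "In this sketch the model would only ever see the five source-level series, while any metric would be computed on the two\n",
    "columns of `target_forecast`."
   ]
  },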
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
      "[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.4s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.8s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    1.3s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    1.3s finished\n"
     ]
    }
   ],
   "source": [
    "pipeline = HierarchicalPipeline(\n",
    "    transforms=[\n",
    "        LagTransform(in_column=\"target\", lags=[1, 2, 3, 4, 6, 12]),\n",
    "    ],\n",
    "    model=LinearPerSegmentModel(),\n",
    "    reconciliator=BottomUpReconciliator(target_level=\"region_level\", source_level=\"city_level\"),\n",
    ")\n",
    "\n",
    "bottom_up_metrics, _, _ = pipeline.backtest(ts=hierarchical_ts, metrics=[SMAPE()], n_folds=3, aggregate_metrics=True)\n",
    "bottom_up_metrics = bottom_up_metrics.set_index(\"segment\").add_suffix(\"_bottom_up\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2. Top-down approach <a class=\"anchor\" id=\"chapter3_2\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Top-down approach is based on the idea of generating forecasts for time series at higher hierarchy levels and then\n",
    "performing disaggregation to lower levels. This approach can be expressed with the following formula:\n",
    "\n",
    "\\begin{align*}\n",
    "\\hat y_{AA,h} = p_{AA} \\hat y_A, &&\n",
    "\\hat y_{AB,h} = p_{AB} \\hat y_A, &&\n",
    "\\hat y_{BA,h} = p_{BA} \\hat y_B, &&\n",
    "\\hat y_{BB,h} = p_{BB} \\hat y_B, &&\n",
    "\\hat y_{BC,h} = p_{BC} \\hat y_B\n",
    "\\end{align*}\n",
    "\n",
    "In matrix notations this equation could be rewritten as:\n",
    "\n",
    "\\begin{equation}\n",
    "    \\begin{bmatrix}\n",
    "    \\hat y_{AA,h} \\\\ \\hat y_{AB,h} \\\\ \\hat y_{BA,h} \\\\ \\hat y_{BB,h} \\\\ \\hat y_{BC,h}\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    \\begin{bmatrix}\n",
    "    p_{AA} & 0 & 0 & 0 & 0 \\\\\n",
    "    0 & p_{AB} & 0 & 0 & 0 \\\\\n",
    "    0 & 0 & p_{BA} & 0 & 0 \\\\\n",
    "    0 & 0 & 0 & p_{BB} & 0 \\\\\n",
    "    0 & 0 & 0 & 0 & p_{BC} \\\\\n",
    "    \\end{bmatrix}\n",
    "    \\begin{bmatrix}\n",
    "    1 & 0 \\\\\n",
    "    1 & 0 \\\\\n",
    "    0 & 1 \\\\\n",
    "    0 & 1 \\\\\n",
    "    0 & 1 \\\\\n",
    "    \\end{bmatrix}\n",
    "    \\begin{bmatrix}\n",
    "    \\hat y_{A,h} \\\\ \\hat y_{B,h}\n",
    "    \\end{bmatrix}\n",
    "    =\n",
    "    P S^T\n",
    "    \\begin{bmatrix}\n",
    "    \\hat y_{A,h} \\\\ \\hat y_{B,h}\n",
    "    \\end{bmatrix}\n",
    "\\end{equation}\n",
    "\n",
    "The main challenge of this approach is proportions estimation.\n",
    "In ETNA library, there are two methods available:\n",
    "* Average historical proportions (AHP)\n",
    "* Proportions of the historical averages (PHA)\n",
    "\n",
    "**Average historical proportions**\n",
    "\n",
    "This method is based on averaging historical proportions:\n",
    "\\begin{equation}\n",
    "\\large p_i = \\frac{1}{n} \\sum_{t = T - n}^{T} \\frac{y_{i, t}}{y_t}\n",
    "\\end{equation}\n",
    "\n",
    "where $n$ - window size, $T$ - latest timestamp, $y_{i, t}$ - time series at the lower level, $y_t$ - time series at\n",
    "the higher level. Both $y_{i, t}$ and $y_t$ have hierarchical relationship.\n",
    "\n",
    "**Proportions of the historical averages**\n",
    "This approach uses a proportion of the averages for estimation:\n",
    "\\begin{equation}\n",
    "\\large p_i = \\sum_{t = T - n}^{T} \\frac{y_{i, t}}{n} \\Bigg / \\sum_{t = T - n}^{T} \\frac{y_t}{n}\n",
    "\\end{equation}\n",
    "\n",
    "where $n$ - window size, $T$ - latest timestamp, $y_{i, t}$ - time series at the lower level, $y_t$ - time series at\n",
    "the higher level. Both $y_{i, t}$ and $y_t$ have hierarchical relationship.\n",
    "\n",
    "Described methods require only series at the higher level for forecasting. Advantages of this approach are: simplicity and\n",
    "reliability. Loss of information is main disadvantage of the approach.\n",
    "\n",
    "This method can be useful when it is needed to forecast lower level series, but some of them have more noise.\n",
    "Aggregation to a higher level and reconciliation back helps to use more meaningful information while modeling.\n"
   ]
  },
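  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the difference between the two estimators concrete, the following minimal sketch computes both on a toy\n",
    "hierarchy with NumPy. The arrays `children` and `parent` and the helper names `ahp_proportions` and `pha_proportions`\n",
    "are illustrative assumptions and not part of the ETNA API; the sketch only mirrors the formulas above.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def ahp_proportions(children, parent, n):\n",
    "    # Average historical proportions: mean of per-timestamp shares over the last n steps.\n",
    "    return (children[-n:] / parent[-n:, None]).mean(axis=0)\n",
    "\n",
    "\n",
    "def pha_proportions(children, parent, n):\n",
    "    # Proportions of the historical averages: ratio of the window means.\n",
    "    return children[-n:].mean(axis=0) / parent[-n:].mean()\n",
    "\n",
    "\n",
    "# Toy data: rows are timestamps, columns are the two child series of one parent.\n",
    "children = np.array([[10.0, 30.0], [20.0, 20.0], [90.0, 10.0]])\n",
    "parent = children.sum(axis=1)\n",
    "\n",
    "p_ahp = ahp_proportions(children, parent, n=3)\n",
    "p_pha = pha_proportions(children, parent, n=3)\n",
    "\n",
    "# Disaggregate a hypothetical parent forecast to the child level.\n",
    "parent_forecast = 200.0\n",
    "child_forecast_ahp = p_ahp * parent_forecast\n",
    "child_forecast_pha = p_pha * parent_forecast\n",
    "```\n",
    "\n",
    "Both estimates sum to one within a parent. AHP weights every timestamp equally, while PHA effectively gives more weight\n",
    "to timestamps where the parent series is large, so the two can differ noticeably when shares change over time."
   ]
  },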
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "from etna.reconciliation import TopDownReconciliator"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`TopDownReconciliator` accepts four arguments in its constructor. You need to specify reconciliation levels,\n",
    "a method and a window size. First, let's look at the AHP top-down reconciliation method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "ahp_reconciliator = TopDownReconciliator(\n",
    "    target_level=\"region_level\", source_level=\"reason_level\", method=\"AHP\", period=6\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The top-down approach has slightly different dataset levels requirements in comparison to the bottom-up method.\n",
    "Here source level should be higher than the target level, and the current dataset level should not be higher\n",
    "than the target level.\n",
    "\n",
    "After all level requirements met and the reconciliator is fitted, we can obtain the mapping matrix. Note, that now\n",
    "mapping matrix contains reconciliation proportions, and not only zeros and ones.\n",
    "\n",
    "Columns of the top-down mapping matrix correspond to segments at the source level of the hierarchy, and rows to\n",
    "the segments at the target level."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0.29517217, 0.        , 0.        , 0.        ],\n",
       "       [0.17522331, 0.        , 0.        , 0.        ],\n",
       "       [0.29906179, 0.        , 0.        , 0.        ],\n",
       "       [0.06509802, 0.        , 0.        , 0.        ],\n",
       "       [0.10138424, 0.        , 0.        , 0.        ],\n",
       "       [0.0348691 , 0.        , 0.        , 0.        ],\n",
       "       [0.02919136, 0.        , 0.        , 0.        ],\n",
       "       [0.        , 0.35663824, 0.        , 0.        ],\n",
       "       [0.        , 0.19596791, 0.        , 0.        ],\n",
       "       [0.        , 0.25065754, 0.        , 0.        ],\n",
       "       [0.        , 0.06313639, 0.        , 0.        ],\n",
       "       [0.        , 0.09261382, 0.        , 0.        ],\n",
       "       [0.        , 0.02383924, 0.        , 0.        ],\n",
       "       [0.        , 0.01714686, 0.        , 0.        ],\n",
       "       [0.        , 0.        , 0.29766462, 0.        ],\n",
       "       [0.        , 0.        , 0.16667059, 0.        ],\n",
       "       [0.        , 0.        , 0.27550314, 0.        ],\n",
       "       [0.        , 0.        , 0.0654707 , 0.        ],\n",
       "       [0.        , 0.        , 0.13979554, 0.        ],\n",
       "       [0.        , 0.        , 0.0245672 , 0.        ],\n",
       "       [0.        , 0.        , 0.03032821, 0.        ],\n",
       "       [0.        , 0.        , 0.        , 0.29191277],\n",
       "       [0.        , 0.        , 0.        , 0.15036933],\n",
       "       [0.        , 0.        , 0.        , 0.25667986],\n",
       "       [0.        , 0.        , 0.        , 0.09445469],\n",
       "       [0.        , 0.        , 0.        , 0.1319362 ],\n",
       "       [0.        , 0.        , 0.        , 0.03209989],\n",
       "       [0.        , 0.        , 0.        , 0.04254726]])"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ahp_reconciliator.fit(ts=hierarchical_ts)\n",
    "ahp_reconciliator.mapping_matrix.toarray()"
   ]
  },
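  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The block structure of this matrix follows from the product $P S^T$ introduced earlier. The sketch below rebuilds it with\n",
    "NumPy for the toy hierarchy with segments A and B; the proportions in `p` and the forecasts in `source_forecast` are made-up\n",
    "values used only for illustration, and the dense arrays are a simplification of the sparse matrix returned by ETNA.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Summing matrix S for the toy hierarchy: A = AA + AB, B = BA + BB + BC.\n",
    "S = np.array([\n",
    "    [1, 1, 0, 0, 0],\n",
    "    [0, 0, 1, 1, 1],\n",
    "])\n",
    "\n",
    "# Hypothetical proportions of each child within its parent.\n",
    "p = np.array([0.6, 0.4, 0.5, 0.3, 0.2])\n",
    "\n",
    "# Top-down mapping matrix: rows are target segments, columns are source segments.\n",
    "mapping = np.diag(p) @ S.T\n",
    "\n",
    "# Forecasts at the source level for segments A and B.\n",
    "source_forecast = np.array([100.0, 50.0])\n",
    "target_forecast = mapping @ source_forecast\n",
    "```\n",
    "\n",
    "Each column of `mapping` sums to one, so summing the disaggregated forecasts with `S @ target_forecast` recovers the\n",
    "source-level forecasts. The columns of the fitted matrix above also sum to one, up to rounding."
   ]
  },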
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let’s fit `HierarchicalPipeline` with **AHP** method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
      "[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.1s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.3s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.5s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.5s finished\n"
     ]
    }
   ],
   "source": [
    "reconciliator = TopDownReconciliator(target_level=\"region_level\", source_level=\"reason_level\", method=\"AHP\", period=9)\n",
    "\n",
    "pipeline = HierarchicalPipeline(\n",
    "    transforms=[\n",
    "        LagTransform(in_column=\"target\", lags=[1, 2, 3, 4, 6, 12]),\n",
    "    ],\n",
    "    model=LinearPerSegmentModel(),\n",
    "    reconciliator=reconciliator,\n",
    ")\n",
    "\n",
    "ahp_metrics, _, _ = pipeline.backtest(ts=hierarchical_ts, metrics=[SMAPE()], n_folds=3, aggregate_metrics=True)\n",
    "ahp_metrics = ahp_metrics.set_index(\"segment\").add_suffix(\"_ahp\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To use another top-down proportion estimation method pass different method name to the `TopDownReconciliator` constructor.\n",
    "Let's take a look at the PHA method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "pha_reconciliator = TopDownReconciliator(\n",
    "    target_level=\"region_level\", source_level=\"reason_level\", method=\"PHA\", period=6\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It should be noted that the fitted mapping matrix has the same structure as in the previous method, but with slightly\n",
    "different non-zero values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0.29761574, 0.        , 0.        , 0.        ],\n",
       "       [0.17910202, 0.        , 0.        , 0.        ],\n",
       "       [0.29400697, 0.        , 0.        , 0.        ],\n",
       "       [0.0651224 , 0.        , 0.        , 0.        ],\n",
       "       [0.10000206, 0.        , 0.        , 0.        ],\n",
       "       [0.03596948, 0.        , 0.        , 0.        ],\n",
       "       [0.02818132, 0.        , 0.        , 0.        ],\n",
       "       [0.        , 0.35710317, 0.        , 0.        ],\n",
       "       [0.        , 0.19744442, 0.        , 0.        ],\n",
       "       [0.        , 0.24879185, 0.        , 0.        ],\n",
       "       [0.        , 0.06362301, 0.        , 0.        ],\n",
       "       [0.        , 0.09206311, 0.        , 0.        ],\n",
       "       [0.        , 0.02404128, 0.        , 0.        ],\n",
       "       [0.        , 0.01693316, 0.        , 0.        ],\n",
       "       [0.        , 0.        , 0.29730368, 0.        ],\n",
       "       [0.        , 0.        , 0.16779538, 0.        ],\n",
       "       [0.        , 0.        , 0.27544335, 0.        ],\n",
       "       [0.        , 0.        , 0.06506127, 0.        ],\n",
       "       [0.        , 0.        , 0.139399  , 0.        ],\n",
       "       [0.        , 0.        , 0.02441176, 0.        ],\n",
       "       [0.        , 0.        , 0.03058557, 0.        ],\n",
       "       [0.        , 0.        , 0.        , 0.28940705],\n",
       "       [0.        , 0.        , 0.        , 0.14772684],\n",
       "       [0.        , 0.        , 0.        , 0.26106345],\n",
       "       [0.        , 0.        , 0.        , 0.09481879],\n",
       "       [0.        , 0.        , 0.        , 0.13193001],\n",
       "       [0.        , 0.        , 0.        , 0.03034655],\n",
       "       [0.        , 0.        , 0.        , 0.04470731]])"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pha_reconciliator.fit(ts=hierarchical_ts)\n",
    "pha_reconciliator.mapping_matrix.toarray()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let’s fit `HierarchicalPipeline` with **PHA** method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
      "[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.1s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.4s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.4s finished\n"
     ]
    }
   ],
   "source": [
    "reconciliator = TopDownReconciliator(target_level=\"region_level\", source_level=\"reason_level\", method=\"PHA\", period=9)\n",
    "\n",
    "pipeline = HierarchicalPipeline(\n",
    "    transforms=[\n",
    "        LagTransform(in_column=\"target\", lags=[1, 2, 3, 4, 6, 12]),\n",
    "    ],\n",
    "    model=LinearPerSegmentModel(),\n",
    "    reconciliator=reconciliator,\n",
    ")\n",
    "\n",
    "pha_metrics, _, _ = pipeline.backtest(ts=hierarchical_ts, metrics=[SMAPE()], n_folds=3, aggregate_metrics=True)\n",
    "pha_metrics = pha_metrics.set_index(\"segment\").add_suffix(\"_pha\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, let's forecast the middle level series directly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
      "[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.2s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.5s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.8s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.8s finished\n"
     ]
    }
   ],
   "source": [
    "region_level_ts = hierarchical_ts.get_level_dataset(target_level=\"region_level\")\n",
    "\n",
    "pipeline = Pipeline(\n",
    "    transforms=[\n",
    "        LagTransform(in_column=\"target\", lags=[1, 2, 3, 4, 6, 12]),\n",
    "    ],\n",
    "    model=LinearPerSegmentModel(),\n",
    ")\n",
    "\n",
    "region_level_metric, _, _ = pipeline.backtest(ts=region_level_ts, metrics=[SMAPE()], n_folds=3, aggregate_metrics=True)\n",
    "\n",
    "region_level_metric = region_level_metric.set_index(\"segment\").add_suffix(\"_region_level\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can take a look at metrics and compare methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>SMAPE_bottom_up</th>\n",
       "      <th>SMAPE_ahp</th>\n",
       "      <th>SMAPE_pha</th>\n",
       "      <th>SMAPE_region_level</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Bus_NSW</th>\n",
       "      <td>5.270422</td>\n",
       "      <td>6.519390</td>\n",
       "      <td>6.318020</td>\n",
       "      <td>8.002023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bus_NT</th>\n",
       "      <td>25.765018</td>\n",
       "      <td>15.154473</td>\n",
       "      <td>14.734894</td>\n",
       "      <td>35.648559</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bus_QLD</th>\n",
       "      <td>18.254162</td>\n",
       "      <td>3.727278</td>\n",
       "      <td>3.843837</td>\n",
       "      <td>5.920212</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bus_SA</th>\n",
       "      <td>15.282322</td>\n",
       "      <td>18.196766</td>\n",
       "      <td>18.443477</td>\n",
       "      <td>17.586339</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bus_TAS</th>\n",
       "      <td>30.695013</td>\n",
       "      <td>25.932555</td>\n",
       "      <td>25.145120</td>\n",
       "      <td>8.810328</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bus_VIC</th>\n",
       "      <td>15.116212</td>\n",
       "      <td>4.755657</td>\n",
       "      <td>4.252078</td>\n",
       "      <td>10.312053</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Bus_WA</th>\n",
       "      <td>10.009304</td>\n",
       "      <td>18.514307</td>\n",
       "      <td>18.415316</td>\n",
       "      <td>10.715275</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_NSW</th>\n",
       "      <td>14.454165</td>\n",
       "      <td>7.705629</td>\n",
       "      <td>8.011244</td>\n",
       "      <td>9.115648</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_NT</th>\n",
       "      <td>53.250687</td>\n",
       "      <td>44.949294</td>\n",
       "      <td>46.821349</td>\n",
       "      <td>17.153756</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_QLD</th>\n",
       "      <td>9.624166</td>\n",
       "      <td>8.647920</td>\n",
       "      <td>7.722205</td>\n",
       "      <td>11.364234</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_SA</th>\n",
       "      <td>8.202269</td>\n",
       "      <td>20.085900</td>\n",
       "      <td>19.786931</td>\n",
       "      <td>11.244287</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_TAS</th>\n",
       "      <td>51.592386</td>\n",
       "      <td>50.644414</td>\n",
       "      <td>51.205854</td>\n",
       "      <td>55.117682</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_VIC</th>\n",
       "      <td>7.125269</td>\n",
       "      <td>17.980484</td>\n",
       "      <td>20.270132</td>\n",
       "      <td>21.994822</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hol_WA</th>\n",
       "      <td>16.415138</td>\n",
       "      <td>13.303132</td>\n",
       "      <td>13.703019</td>\n",
       "      <td>25.802063</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_NSW</th>\n",
       "      <td>29.987238</td>\n",
       "      <td>35.335283</td>\n",
       "      <td>35.113979</td>\n",
       "      <td>22.802959</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_NT</th>\n",
       "      <td>98.032493</td>\n",
       "      <td>50.694763</td>\n",
       "      <td>55.755842</td>\n",
       "      <td>48.984850</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_QLD</th>\n",
       "      <td>31.464303</td>\n",
       "      <td>26.668852</td>\n",
       "      <td>27.804644</td>\n",
       "      <td>14.136124</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_SA</th>\n",
       "      <td>24.098806</td>\n",
       "      <td>41.848523</td>\n",
       "      <td>41.911698</td>\n",
       "      <td>22.057562</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_TAS</th>\n",
       "      <td>55.187208</td>\n",
       "      <td>46.457792</td>\n",
       "      <td>44.252704</td>\n",
       "      <td>23.528327</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_VIC</th>\n",
       "      <td>31.365795</td>\n",
       "      <td>37.310906</td>\n",
       "      <td>36.372753</td>\n",
       "      <td>25.495443</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oth_WA</th>\n",
       "      <td>26.894592</td>\n",
       "      <td>23.561252</td>\n",
       "      <td>26.071981</td>\n",
       "      <td>25.078132</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_NSW</th>\n",
       "      <td>4.977585</td>\n",
       "      <td>7.088159</td>\n",
       "      <td>7.067566</td>\n",
       "      <td>8.696804</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_NT</th>\n",
       "      <td>46.565888</td>\n",
       "      <td>28.796286</td>\n",
       "      <td>29.001835</td>\n",
       "      <td>35.465418</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_QLD</th>\n",
       "      <td>12.675037</td>\n",
       "      <td>4.312979</td>\n",
       "      <td>4.370722</td>\n",
       "      <td>4.169244</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_SA</th>\n",
       "      <td>15.613376</td>\n",
       "      <td>19.780459</td>\n",
       "      <td>20.278122</td>\n",
       "      <td>24.620504</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_TAS</th>\n",
       "      <td>33.182773</td>\n",
       "      <td>26.505685</td>\n",
       "      <td>29.206359</td>\n",
       "      <td>28.587697</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_VIC</th>\n",
       "      <td>9.237164</td>\n",
       "      <td>10.549981</td>\n",
       "      <td>10.226061</td>\n",
       "      <td>21.911153</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>VFR_WA</th>\n",
       "      <td>17.416115</td>\n",
       "      <td>12.329126</td>\n",
       "      <td>11.702146</td>\n",
       "      <td>3.941069</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         SMAPE_bottom_up  SMAPE_ahp  SMAPE_pha  SMAPE_region_level\n",
       "segment                                                           \n",
       "Bus_NSW         5.270422   6.519390   6.318020            8.002023\n",
       "Bus_NT         25.765018  15.154473  14.734894           35.648559\n",
       "Bus_QLD        18.254162   3.727278   3.843837            5.920212\n",
       "Bus_SA         15.282322  18.196766  18.443477           17.586339\n",
       "Bus_TAS        30.695013  25.932555  25.145120            8.810328\n",
       "Bus_VIC        15.116212   4.755657   4.252078           10.312053\n",
       "Bus_WA         10.009304  18.514307  18.415316           10.715275\n",
       "Hol_NSW        14.454165   7.705629   8.011244            9.115648\n",
       "Hol_NT         53.250687  44.949294  46.821349           17.153756\n",
       "Hol_QLD         9.624166   8.647920   7.722205           11.364234\n",
       "Hol_SA          8.202269  20.085900  19.786931           11.244287\n",
       "Hol_TAS        51.592386  50.644414  51.205854           55.117682\n",
       "Hol_VIC         7.125269  17.980484  20.270132           21.994822\n",
       "Hol_WA         16.415138  13.303132  13.703019           25.802063\n",
       "Oth_NSW        29.987238  35.335283  35.113979           22.802959\n",
       "Oth_NT         98.032493  50.694763  55.755842           48.984850\n",
       "Oth_QLD        31.464303  26.668852  27.804644           14.136124\n",
       "Oth_SA         24.098806  41.848523  41.911698           22.057562\n",
       "Oth_TAS        55.187208  46.457792  44.252704           23.528327\n",
       "Oth_VIC        31.365795  37.310906  36.372753           25.495443\n",
       "Oth_WA         26.894592  23.561252  26.071981           25.078132\n",
       "VFR_NSW         4.977585   7.088159   7.067566            8.696804\n",
       "VFR_NT         46.565888  28.796286  29.001835           35.465418\n",
       "VFR_QLD        12.675037   4.312979   4.370722            4.169244\n",
       "VFR_SA         15.613376  19.780459  20.278122           24.620504\n",
       "VFR_TAS        33.182773  26.505685  29.206359           28.587697\n",
       "VFR_VIC         9.237164  10.549981  10.226061           21.911153\n",
       "VFR_WA         17.416115  12.329126  11.702146            3.941069"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_metrics = pd.concat([bottom_up_metrics, ahp_metrics, pha_metrics, region_level_metric], axis=1)\n",
    "\n",
    "all_metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "SMAPE_bottom_up       25.634104\n",
       "SMAPE_ahp             22.405616\n",
       "SMAPE_pha             22.778925\n",
       "SMAPE_region_level    19.937949\n",
       "dtype: float64"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_metrics.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results presented above show that using reconciliation methods can improve forecasting quality\n",
    "for some segments. In this particular case, the direct forecast for segments at the Reason level is slightly better\n",
    "on average than the reconciliation forecasts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Exogenous variables for hierarchical forecasts <a class=\"anchor\" id=\"chapter4\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This section shows how exogenous variables can be added to a hierarchical `TSDataset`.\n",
    "\n",
    "Before adding exogenous variables to the dataset, we should decide at which level we should place them. Model fitting and\n",
    "initial forecasting in the `HierarchicalPipeline` are made at the **source level**. So exogenous variables should be at the\n",
    "**source level** as well.\n",
    "\n",
    "Let's try to add monthly indicators to our model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "from etna.datasets.utils import duplicate_data\n",
    "\n",
    "horizon = 3\n",
    "exog_index = pd.date_range(\"2006-01-01\", periods=periods + horizon, freq=\"MS\")\n",
    "\n",
    "months_df = pd.DataFrame({\"timestamp\": exog_index.values, \"month\": exog_index.month.astype(\"category\")})\n",
    "\n",
    "reason_level_segments = hierarchical_ts.hierarchical_structure.get_level_segments(level_name=\"reason_level\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus</th>\n",
       "      <th>Hol</th>\n",
       "      <th>Oth</th>\n",
       "      <th>VFR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>month</th>\n",
       "      <th>month</th>\n",
       "      <th>month</th>\n",
       "      <th>month</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment      Bus   Hol   Oth   VFR\n",
       "feature    month month month month\n",
       "timestamp                         \n",
       "2006-01-01     1     1     1     1\n",
       "2006-02-01     2     2     2     2\n",
       "2006-03-01     3     3     3     3\n",
       "2006-04-01     4     4     4     4\n",
       "2006-05-01     5     5     5     5"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "months_ts = duplicate_data(df=months_df, segments=reason_level_segments)\n",
    "months_ts.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Previous block showed how to create exogenous variables and convert to a hierarchical format manually.\n",
    "Another way to convert exogenous variables to a hierarchical dataset is to use `TSDataset.to_hierarchical_dataset`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, let's convert the dataframe to hierarchical long format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>timestamp</th>\n",
       "      <th>month</th>\n",
       "      <th>reason</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2006-01-01</td>\n",
       "      <td>1</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2006-02-01</td>\n",
       "      <td>2</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2006-03-01</td>\n",
       "      <td>3</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2006-04-01</td>\n",
       "      <td>4</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2006-05-01</td>\n",
       "      <td>5</td>\n",
       "      <td>Hol</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   timestamp month reason\n",
       "0 2006-01-01     1    Hol\n",
       "1 2006-02-01     2    Hol\n",
       "2 2006-03-01     3    Hol\n",
       "3 2006-04-01     4    Hol\n",
       "4 2006-05-01     5    Hol"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "months_ts = duplicate_data(df=months_df, segments=reason_level_segments, format=\"long\")\n",
    "months_ts.rename(columns={\"segment\": \"reason\"}, inplace=True)\n",
    "months_ts.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we are ready to use `to_hierarchical_dataset` method. When using this method with exogenous data\n",
    "pass `return_hierarchy=False`, because we want to use hierarchical structure from target variables.\n",
    "Setting `keep_level_columns=True` will add level columns to the result dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus</th>\n",
       "      <th>Hol</th>\n",
       "      <th>Oth</th>\n",
       "      <th>VFR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>month</th>\n",
       "      <th>month</th>\n",
       "      <th>month</th>\n",
       "      <th>month</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment      Bus   Hol   Oth   VFR\n",
       "feature    month month month month\n",
       "timestamp                         \n",
       "2006-01-01     1     1     1     1\n",
       "2006-02-01     2     2     2     2\n",
       "2006-03-01     3     3     3     3\n",
       "2006-04-01     4     4     4     4\n",
       "2006-05-01     5     5     5     5"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "months_ts, _ = TSDataset.to_hierarchical_dataset(df=months_ts, level_columns=[\"reason\"], return_hierarchy=False)\n",
    "months_ts.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When dataframe with exogenous variables is prepared, create new hierarchical dataset with exogenous variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "hierarchical_ts_w_exog = TSDataset(\n",
    "    df=hierarchical_df,\n",
    "    df_exog=months_ts,\n",
    "    hierarchical_structure=hierarchical_structure,\n",
    "    freq=\"MS\",\n",
    "    known_future=\"all\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'df_level=city_level, df_exog_level=reason_level'"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f\"df_level={hierarchical_ts_w_exog.current_df_level}, df_exog_level={hierarchical_ts_w_exog.current_df_exog_level}\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we can see different levels for the dataframes inside the dataset. In such case exogenous variables wouldn't be merged to target\n",
    "variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th>Bus_NSW_city</th>\n",
       "      <th>Bus_NSW_noncity</th>\n",
       "      <th>Bus_NT_city</th>\n",
       "      <th>Bus_NT_noncity</th>\n",
       "      <th>Bus_QLD_city</th>\n",
       "      <th>Bus_QLD_noncity</th>\n",
       "      <th>Bus_SA_city</th>\n",
       "      <th>Bus_SA_noncity</th>\n",
       "      <th>Bus_TAS_city</th>\n",
       "      <th>Bus_TAS_noncity</th>\n",
       "      <th>Bus_VIC_city</th>\n",
       "      <th>Bus_VIC_noncity</th>\n",
       "      <th>Bus_WA_city</th>\n",
       "      <th>Bus_WA_noncity</th>\n",
       "      <th>Hol_NSW_city</th>\n",
       "      <th>Hol_NSW_noncity</th>\n",
       "      <th>Hol_NT_city</th>\n",
       "      <th>Hol_NT_noncity</th>\n",
       "      <th>Hol_QLD_city</th>\n",
       "      <th>Hol_QLD_noncity</th>\n",
       "      <th>Hol_SA_city</th>\n",
       "      <th>Hol_SA_noncity</th>\n",
       "      <th>Hol_TAS_city</th>\n",
       "      <th>Hol_TAS_noncity</th>\n",
       "      <th>Hol_VIC_city</th>\n",
       "      <th>Hol_VIC_noncity</th>\n",
       "      <th>Hol_WA_city</th>\n",
       "      <th>Hol_WA_noncity</th>\n",
       "      <th>Oth_NSW_city</th>\n",
       "      <th>Oth_NSW_noncity</th>\n",
       "      <th>Oth_NT_city</th>\n",
       "      <th>Oth_NT_noncity</th>\n",
       "      <th>Oth_QLD_city</th>\n",
       "      <th>Oth_QLD_noncity</th>\n",
       "      <th>Oth_SA_city</th>\n",
       "      <th>Oth_SA_noncity</th>\n",
       "      <th>Oth_TAS_city</th>\n",
       "      <th>Oth_TAS_noncity</th>\n",
       "      <th>Oth_VIC_city</th>\n",
       "      <th>Oth_VIC_noncity</th>\n",
       "      <th>Oth_WA_city</th>\n",
       "      <th>Oth_WA_noncity</th>\n",
       "      <th>VFR_NSW_city</th>\n",
       "      <th>VFR_NSW_noncity</th>\n",
       "      <th>VFR_NT_city</th>\n",
       "      <th>VFR_NT_noncity</th>\n",
       "      <th>VFR_QLD_city</th>\n",
       "      <th>VFR_QLD_noncity</th>\n",
       "      <th>VFR_SA_city</th>\n",
       "      <th>VFR_SA_noncity</th>\n",
       "      <th>VFR_TAS_city</th>\n",
       "      <th>VFR_TAS_noncity</th>\n",
       "      <th>VFR_VIC_city</th>\n",
       "      <th>VFR_VIC_noncity</th>\n",
       "      <th>VFR_WA_city</th>\n",
       "      <th>VFR_WA_noncity</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1201</td>\n",
       "      <td>1684</td>\n",
       "      <td>136</td>\n",
       "      <td>80</td>\n",
       "      <td>1111</td>\n",
       "      <td>982</td>\n",
       "      <td>388</td>\n",
       "      <td>456</td>\n",
       "      <td>116</td>\n",
       "      <td>107</td>\n",
       "      <td>1164</td>\n",
       "      <td>984</td>\n",
       "      <td>532</td>\n",
       "      <td>874</td>\n",
       "      <td>3096</td>\n",
       "      <td>14493</td>\n",
       "      <td>101</td>\n",
       "      <td>86</td>\n",
       "      <td>4688</td>\n",
       "      <td>4390</td>\n",
       "      <td>888</td>\n",
       "      <td>2201</td>\n",
       "      <td>619</td>\n",
       "      <td>1483</td>\n",
       "      <td>2531</td>\n",
       "      <td>7881</td>\n",
       "      <td>1383</td>\n",
       "      <td>2066</td>\n",
       "      <td>396</td>\n",
       "      <td>510</td>\n",
       "      <td>35</td>\n",
       "      <td>8</td>\n",
       "      <td>431</td>\n",
       "      <td>271</td>\n",
       "      <td>244</td>\n",
       "      <td>73</td>\n",
       "      <td>76</td>\n",
       "      <td>24</td>\n",
       "      <td>181</td>\n",
       "      <td>286</td>\n",
       "      <td>168</td>\n",
       "      <td>37</td>\n",
       "      <td>2709</td>\n",
       "      <td>6689</td>\n",
       "      <td>28</td>\n",
       "      <td>9</td>\n",
       "      <td>3003</td>\n",
       "      <td>2287</td>\n",
       "      <td>1324</td>\n",
       "      <td>869</td>\n",
       "      <td>602</td>\n",
       "      <td>748</td>\n",
       "      <td>2565</td>\n",
       "      <td>3428</td>\n",
       "      <td>1019</td>\n",
       "      <td>762</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2020</td>\n",
       "      <td>2281</td>\n",
       "      <td>138</td>\n",
       "      <td>170</td>\n",
       "      <td>776</td>\n",
       "      <td>1448</td>\n",
       "      <td>346</td>\n",
       "      <td>403</td>\n",
       "      <td>83</td>\n",
       "      <td>290</td>\n",
       "      <td>1014</td>\n",
       "      <td>811</td>\n",
       "      <td>356</td>\n",
       "      <td>1687</td>\n",
       "      <td>1479</td>\n",
       "      <td>9548</td>\n",
       "      <td>201</td>\n",
       "      <td>297</td>\n",
       "      <td>2320</td>\n",
       "      <td>3990</td>\n",
       "      <td>521</td>\n",
       "      <td>1414</td>\n",
       "      <td>409</td>\n",
       "      <td>689</td>\n",
       "      <td>1439</td>\n",
       "      <td>4586</td>\n",
       "      <td>1059</td>\n",
       "      <td>1395</td>\n",
       "      <td>657</td>\n",
       "      <td>581</td>\n",
       "      <td>69</td>\n",
       "      <td>39</td>\n",
       "      <td>669</td>\n",
       "      <td>170</td>\n",
       "      <td>142</td>\n",
       "      <td>221</td>\n",
       "      <td>36</td>\n",
       "      <td>61</td>\n",
       "      <td>229</td>\n",
       "      <td>323</td>\n",
       "      <td>170</td>\n",
       "      <td>99</td>\n",
       "      <td>2184</td>\n",
       "      <td>5645</td>\n",
       "      <td>168</td>\n",
       "      <td>349</td>\n",
       "      <td>1957</td>\n",
       "      <td>2945</td>\n",
       "      <td>806</td>\n",
       "      <td>639</td>\n",
       "      <td>257</td>\n",
       "      <td>266</td>\n",
       "      <td>1852</td>\n",
       "      <td>2255</td>\n",
       "      <td>750</td>\n",
       "      <td>603</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>1975</td>\n",
       "      <td>2118</td>\n",
       "      <td>452</td>\n",
       "      <td>1084</td>\n",
       "      <td>1079</td>\n",
       "      <td>2300</td>\n",
       "      <td>390</td>\n",
       "      <td>360</td>\n",
       "      <td>196</td>\n",
       "      <td>107</td>\n",
       "      <td>1153</td>\n",
       "      <td>791</td>\n",
       "      <td>440</td>\n",
       "      <td>1120</td>\n",
       "      <td>1609</td>\n",
       "      <td>7301</td>\n",
       "      <td>619</td>\n",
       "      <td>745</td>\n",
       "      <td>4758</td>\n",
       "      <td>6975</td>\n",
       "      <td>476</td>\n",
       "      <td>1093</td>\n",
       "      <td>127</td>\n",
       "      <td>331</td>\n",
       "      <td>1488</td>\n",
       "      <td>3572</td>\n",
       "      <td>1101</td>\n",
       "      <td>2297</td>\n",
       "      <td>540</td>\n",
       "      <td>893</td>\n",
       "      <td>150</td>\n",
       "      <td>338</td>\n",
       "      <td>270</td>\n",
       "      <td>1164</td>\n",
       "      <td>397</td>\n",
       "      <td>315</td>\n",
       "      <td>32</td>\n",
       "      <td>23</td>\n",
       "      <td>128</td>\n",
       "      <td>318</td>\n",
       "      <td>380</td>\n",
       "      <td>1166</td>\n",
       "      <td>2225</td>\n",
       "      <td>5052</td>\n",
       "      <td>390</td>\n",
       "      <td>84</td>\n",
       "      <td>2619</td>\n",
       "      <td>2870</td>\n",
       "      <td>1078</td>\n",
       "      <td>375</td>\n",
       "      <td>130</td>\n",
       "      <td>261</td>\n",
       "      <td>1882</td>\n",
       "      <td>1929</td>\n",
       "      <td>953</td>\n",
       "      <td>734</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>1500</td>\n",
       "      <td>1963</td>\n",
       "      <td>243</td>\n",
       "      <td>160</td>\n",
       "      <td>1128</td>\n",
       "      <td>1752</td>\n",
       "      <td>255</td>\n",
       "      <td>635</td>\n",
       "      <td>70</td>\n",
       "      <td>228</td>\n",
       "      <td>1245</td>\n",
       "      <td>508</td>\n",
       "      <td>539</td>\n",
       "      <td>1252</td>\n",
       "      <td>1520</td>\n",
       "      <td>9138</td>\n",
       "      <td>164</td>\n",
       "      <td>250</td>\n",
       "      <td>3328</td>\n",
       "      <td>4781</td>\n",
       "      <td>571</td>\n",
       "      <td>1699</td>\n",
       "      <td>371</td>\n",
       "      <td>949</td>\n",
       "      <td>1906</td>\n",
       "      <td>3575</td>\n",
       "      <td>1128</td>\n",
       "      <td>2433</td>\n",
       "      <td>745</td>\n",
       "      <td>1157</td>\n",
       "      <td>172</td>\n",
       "      <td>453</td>\n",
       "      <td>214</td>\n",
       "      <td>535</td>\n",
       "      <td>194</td>\n",
       "      <td>260</td>\n",
       "      <td>48</td>\n",
       "      <td>43</td>\n",
       "      <td>270</td>\n",
       "      <td>336</td>\n",
       "      <td>410</td>\n",
       "      <td>1139</td>\n",
       "      <td>2918</td>\n",
       "      <td>5385</td>\n",
       "      <td>244</td>\n",
       "      <td>218</td>\n",
       "      <td>2097</td>\n",
       "      <td>2344</td>\n",
       "      <td>568</td>\n",
       "      <td>641</td>\n",
       "      <td>137</td>\n",
       "      <td>257</td>\n",
       "      <td>2208</td>\n",
       "      <td>2882</td>\n",
       "      <td>999</td>\n",
       "      <td>715</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>1196</td>\n",
       "      <td>2151</td>\n",
       "      <td>194</td>\n",
       "      <td>189</td>\n",
       "      <td>1192</td>\n",
       "      <td>1559</td>\n",
       "      <td>386</td>\n",
       "      <td>280</td>\n",
       "      <td>130</td>\n",
       "      <td>205</td>\n",
       "      <td>950</td>\n",
       "      <td>572</td>\n",
       "      <td>582</td>\n",
       "      <td>441</td>\n",
       "      <td>1958</td>\n",
       "      <td>14194</td>\n",
       "      <td>62</td>\n",
       "      <td>151</td>\n",
       "      <td>4930</td>\n",
       "      <td>5117</td>\n",
       "      <td>873</td>\n",
       "      <td>2150</td>\n",
       "      <td>523</td>\n",
       "      <td>1590</td>\n",
       "      <td>2517</td>\n",
       "      <td>8441</td>\n",
       "      <td>1560</td>\n",
       "      <td>2727</td>\n",
       "      <td>426</td>\n",
       "      <td>558</td>\n",
       "      <td>15</td>\n",
       "      <td>47</td>\n",
       "      <td>458</td>\n",
       "      <td>557</td>\n",
       "      <td>147</td>\n",
       "      <td>33</td>\n",
       "      <td>77</td>\n",
       "      <td>60</td>\n",
       "      <td>265</td>\n",
       "      <td>293</td>\n",
       "      <td>162</td>\n",
       "      <td>28</td>\n",
       "      <td>3154</td>\n",
       "      <td>7232</td>\n",
       "      <td>153</td>\n",
       "      <td>125</td>\n",
       "      <td>2703</td>\n",
       "      <td>2933</td>\n",
       "      <td>887</td>\n",
       "      <td>798</td>\n",
       "      <td>347</td>\n",
       "      <td>437</td>\n",
       "      <td>2988</td>\n",
       "      <td>3164</td>\n",
       "      <td>1396</td>\n",
       "      <td>630</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment    Bus_NSW_city Bus_NSW_noncity Bus_NT_city Bus_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1201            1684         136             80   \n",
       "2006-02-01         2020            2281         138            170   \n",
       "2006-03-01         1975            2118         452           1084   \n",
       "2006-04-01         1500            1963         243            160   \n",
       "2006-05-01         1196            2151         194            189   \n",
       "\n",
       "segment    Bus_QLD_city Bus_QLD_noncity Bus_SA_city Bus_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         1111             982         388            456   \n",
       "2006-02-01          776            1448         346            403   \n",
       "2006-03-01         1079            2300         390            360   \n",
       "2006-04-01         1128            1752         255            635   \n",
       "2006-05-01         1192            1559         386            280   \n",
       "\n",
       "segment    Bus_TAS_city Bus_TAS_noncity Bus_VIC_city Bus_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01          116             107         1164             984   \n",
       "2006-02-01           83             290         1014             811   \n",
       "2006-03-01          196             107         1153             791   \n",
       "2006-04-01           70             228         1245             508   \n",
       "2006-05-01          130             205          950             572   \n",
       "\n",
       "segment    Bus_WA_city Bus_WA_noncity Hol_NSW_city Hol_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         532            874         3096           14493   \n",
       "2006-02-01         356           1687         1479            9548   \n",
       "2006-03-01         440           1120         1609            7301   \n",
       "2006-04-01         539           1252         1520            9138   \n",
       "2006-05-01         582            441         1958           14194   \n",
       "\n",
       "segment    Hol_NT_city Hol_NT_noncity Hol_QLD_city Hol_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         101             86         4688            4390   \n",
       "2006-02-01         201            297         2320            3990   \n",
       "2006-03-01         619            745         4758            6975   \n",
       "2006-04-01         164            250         3328            4781   \n",
       "2006-05-01          62            151         4930            5117   \n",
       "\n",
       "segment    Hol_SA_city Hol_SA_noncity Hol_TAS_city Hol_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         888           2201          619            1483   \n",
       "2006-02-01         521           1414          409             689   \n",
       "2006-03-01         476           1093          127             331   \n",
       "2006-04-01         571           1699          371             949   \n",
       "2006-05-01         873           2150          523            1590   \n",
       "\n",
       "segment    Hol_VIC_city Hol_VIC_noncity Hol_WA_city Hol_WA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01         2531            7881        1383           2066   \n",
       "2006-02-01         1439            4586        1059           1395   \n",
       "2006-03-01         1488            3572        1101           2297   \n",
       "2006-04-01         1906            3575        1128           2433   \n",
       "2006-05-01         2517            8441        1560           2727   \n",
       "\n",
       "segment    Oth_NSW_city Oth_NSW_noncity Oth_NT_city Oth_NT_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          396             510          35              8   \n",
       "2006-02-01          657             581          69             39   \n",
       "2006-03-01          540             893         150            338   \n",
       "2006-04-01          745            1157         172            453   \n",
       "2006-05-01          426             558          15             47   \n",
       "\n",
       "segment    Oth_QLD_city Oth_QLD_noncity Oth_SA_city Oth_SA_noncity  \\\n",
       "feature          target          target      target         target   \n",
       "timestamp                                                            \n",
       "2006-01-01          431             271         244             73   \n",
       "2006-02-01          669             170         142            221   \n",
       "2006-03-01          270            1164         397            315   \n",
       "2006-04-01          214             535         194            260   \n",
       "2006-05-01          458             557         147             33   \n",
       "\n",
       "segment    Oth_TAS_city Oth_TAS_noncity Oth_VIC_city Oth_VIC_noncity  \\\n",
       "feature          target          target       target          target   \n",
       "timestamp                                                              \n",
       "2006-01-01           76              24          181             286   \n",
       "2006-02-01           36              61          229             323   \n",
       "2006-03-01           32              23          128             318   \n",
       "2006-04-01           48              43          270             336   \n",
       "2006-05-01           77              60          265             293   \n",
       "\n",
       "segment    Oth_WA_city Oth_WA_noncity VFR_NSW_city VFR_NSW_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01         168             37         2709            6689   \n",
       "2006-02-01         170             99         2184            5645   \n",
       "2006-03-01         380           1166         2225            5052   \n",
       "2006-04-01         410           1139         2918            5385   \n",
       "2006-05-01         162             28         3154            7232   \n",
       "\n",
       "segment    VFR_NT_city VFR_NT_noncity VFR_QLD_city VFR_QLD_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01          28              9         3003            2287   \n",
       "2006-02-01         168            349         1957            2945   \n",
       "2006-03-01         390             84         2619            2870   \n",
       "2006-04-01         244            218         2097            2344   \n",
       "2006-05-01         153            125         2703            2933   \n",
       "\n",
       "segment    VFR_SA_city VFR_SA_noncity VFR_TAS_city VFR_TAS_noncity  \\\n",
       "feature         target         target       target          target   \n",
       "timestamp                                                            \n",
       "2006-01-01        1324            869          602             748   \n",
       "2006-02-01         806            639          257             266   \n",
       "2006-03-01        1078            375          130             261   \n",
       "2006-04-01         568            641          137             257   \n",
       "2006-05-01         887            798          347             437   \n",
       "\n",
       "segment    VFR_VIC_city VFR_VIC_noncity VFR_WA_city VFR_WA_noncity  \n",
       "feature          target          target      target         target  \n",
       "timestamp                                                           \n",
       "2006-01-01         2565            3428        1019            762  \n",
       "2006-02-01         1852            2255         750            603  \n",
       "2006-03-01         1882            1929         953            734  \n",
       "2006-04-01         2208            2882         999            715  \n",
       "2006-05-01         2988            3164        1396            630  "
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts_w_exog.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Exogenous data is merged with the target data only when both dataframes are at the same level of the hierarchy.\n",
    "Right now, our dataset is at a lower level than the exogenous variables, so they aren't merged.\n",
    "To aggregate the dataset to a higher level, we can use the `get_level_dataset` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th>segment</th>\n",
       "      <th colspan=\"2\" halign=\"left\">Bus</th>\n",
       "      <th colspan=\"2\" halign=\"left\">Hol</th>\n",
       "      <th colspan=\"2\" halign=\"left\">Oth</th>\n",
       "      <th colspan=\"2\" halign=\"left\">VFR</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feature</th>\n",
       "      <th>month</th>\n",
       "      <th>target</th>\n",
       "      <th>month</th>\n",
       "      <th>target</th>\n",
       "      <th>month</th>\n",
       "      <th>target</th>\n",
       "      <th>month</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>timestamp</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2006-01-01</th>\n",
       "      <td>1</td>\n",
       "      <td>9815.0</td>\n",
       "      <td>1</td>\n",
       "      <td>45906.0</td>\n",
       "      <td>1</td>\n",
       "      <td>2740.0</td>\n",
       "      <td>1</td>\n",
       "      <td>26042.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-02-01</th>\n",
       "      <td>2</td>\n",
       "      <td>11823.0</td>\n",
       "      <td>2</td>\n",
       "      <td>29347.0</td>\n",
       "      <td>2</td>\n",
       "      <td>3466.0</td>\n",
       "      <td>2</td>\n",
       "      <td>20676.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-03-01</th>\n",
       "      <td>3</td>\n",
       "      <td>13565.0</td>\n",
       "      <td>3</td>\n",
       "      <td>32492.0</td>\n",
       "      <td>3</td>\n",
       "      <td>6114.0</td>\n",
       "      <td>3</td>\n",
       "      <td>20582.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-04-01</th>\n",
       "      <td>4</td>\n",
       "      <td>11478.0</td>\n",
       "      <td>4</td>\n",
       "      <td>31813.0</td>\n",
       "      <td>4</td>\n",
       "      <td>5976.0</td>\n",
       "      <td>4</td>\n",
       "      <td>21613.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2006-05-01</th>\n",
       "      <td>5</td>\n",
       "      <td>10027.0</td>\n",
       "      <td>5</td>\n",
       "      <td>46793.0</td>\n",
       "      <td>5</td>\n",
       "      <td>3126.0</td>\n",
       "      <td>5</td>\n",
       "      <td>26947.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "segment      Bus            Hol            Oth           VFR         \n",
       "feature    month   target month   target month  target month   target\n",
       "timestamp                                                            \n",
       "2006-01-01     1   9815.0     1  45906.0     1  2740.0     1  26042.0\n",
       "2006-02-01     2  11823.0     2  29347.0     2  3466.0     2  20676.0\n",
       "2006-03-01     3  13565.0     3  32492.0     3  6114.0     3  20582.0\n",
       "2006-04-01     4  11478.0     4  31813.0     4  5976.0     4  21613.0\n",
       "2006-05-01     5  10027.0     5  46793.0     5  3126.0     5  26947.0"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hierarchical_ts_w_exog.get_level_dataset(target_level=\"reason_level\").head()"
   ]
  },
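  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The modeling process stays the same as in the previous cases.\n",
    "The model is fitted at `reason_level`, where the exogenous `month` feature is available, and the forecasts are then reconciled down to `region_level`."
   ]
  },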
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n",
      "[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.2s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.5s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.8s remaining:    0.0s\n",
      "[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.8s finished\n"
     ]
    }
   ],
   "source": [
    "# Aggregate the dataset to region_level, the target level of the reconciliator\n",
    "region_level_ts_w_exog = hierarchical_ts_w_exog.get_level_dataset(target_level=\"region_level\")\n",
    "\n",
    "pipeline = HierarchicalPipeline(\n",
    "    transforms=[\n",
    "        OneHotEncoderTransform(in_column=\"month\"),\n",
    "        LagTransform(in_column=\"target\", lags=[1, 2, 3, 4, 6, 12]),\n",
    "    ],\n",
    "    model=LinearPerSegmentModel(),\n",
    "    reconciliator=TopDownReconciliator(\n",
    "        target_level=\"region_level\", source_level=\"reason_level\", period=9, method=\"AHP\"\n",
    "    ),\n",
    ")\n",
    "\n",
    "metric, _, _ = pipeline.backtest(ts=region_level_ts_w_exog, metrics=[SMAPE()], n_folds=3, aggregate_metrics=True)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
